2019-12-26 14:45:42 -05:00
// Copyright Epic Games, Inc. All Rights Reserved.
2015-05-11 20:04:15 -04:00
# include "GlobalDistanceField.h"
Copying //UE4/Dev-Build to //UE4/Dev-Main (Source: //UE4/Dev-Build @ 3209340)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3209340 on 2016/11/23 by Ben.Marsh
Convert UE4 codebase to an "include what you use" model - where every header just includes the dependencies it needs, rather than every source file including large monolithic headers like Engine.h and UnrealEd.h.
Measured full rebuild times around 2x faster using XGE on Windows, and improvements of 25% or more for incremental builds and full rebuilds on most other platforms.
* Every header now includes everything it needs to compile.
* There's a CoreMinimal.h header that gets you a set of ubiquitous types from Core (eg. FString, FName, TArray, FVector, etc...). Most headers now include this first.
* There's a CoreTypes.h header that sets up primitive UE4 types and build macros (int32, PLATFORM_WIN64, etc...). All headers in Core include this first, as does CoreMinimal.h.
* Every .cpp file includes its matching .h file first.
* This helps validate that each header is including everything it needs to compile.
* No engine code includes a monolithic header such as Engine.h or UnrealEd.h any more.
* You will get a warning if you try to include one of these from the engine. They still exist for compatibility with game projects and do not produce warnings when included there.
* There have only been minor changes to our internal games to accommodate these changes. The intent is for this to be as seamless as possible.
* No engine code explicitly includes a precompiled header any more.
* We still use PCHs, but they're force-included on the compiler command line by UnrealBuildTool instead. This lets us tune what they contain without breaking any existing include dependencies.
* PCHs are generated by a tool to get a statistical amount of coverage for the source files using it, and I've seeded the new shared PCHs to contain any header included by > 15% of source files.
Tool used to generate this transform is at Engine\Source\Programs\IncludeTool.
[CL 3209342 by Ben Marsh in Main branch]
2016-11-23 15:48:37 -05:00
# include "DistanceFieldLightingShared.h"
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3244756 on 2017/01/03 by Marcus.Wassmer
Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3248667 on 2017/01/05 by Olaf.Piesche
Resaving default asset because of engine version issue; maybe unnecessary, but resaving niagara engine content to be sure
#jira UE-40160
Change 3249324 on 2017/01/06 by Marcus.Wassmer
Resave with an actual version to stop cook warning
Change 3249611 on 2017/01/06 by Marcus.Wassmer
Just remove warning-causing niagara data for now.
Change 3308052 on 2017/02/16 by Rolando.Caloca
DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute
Change 3308109 on 2017/02/16 by Rolando.Caloca
DR - Upgrade glslang to 1.0.39.1
Change 3308111 on 2017/02/16 by Rolando.Caloca
DR - Update Vulkan distribution to 1.0.39.1
Change 3308153 on 2017/02/16 by Rolando.Caloca
DR - Updated glslang libs
Change 3308842 on 2017/02/17 by Rolando.Caloca
DR - Fixed copy/paste
Change 3310007 on 2017/02/17 by Chris.Bunner
Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971.
#jira UE-37792
Change 3310154 on 2017/02/17 by Chris.Bunner
Assert when attempting to add a custom material attribute already in the base attributes list.
Change 3310155 on 2017/02/17 by Chris.Bunner
PR #3231: Validate material index before accessing (Contributed by projectgheist)
#jira UE-41774, UE-41788
Change 3310162 on 2017/02/17 by Chris.Bunner
PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist)
#jira UE-41823, UE-41950
Change 3310176 on 2017/02/17 by Chris.Bunner
Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini).
Update to AGS 5.0.5.
Partial code tidy up.
Change 3310187 on 2017/02/17 by Chris.Bunner
Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression.
#jira UE-41594
Change 3310215 on 2017/02/17 by Chris.Bunner
Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available).
More descriptive error for missing Cubemap UV input on TextureSample material node.
#jira UE-33098
Change 3310838 on 2017/02/18 by Joe.Graf
Moved some private functions to public for a licensee
#CodeReview: matt.kuhlenschmidt
#rb: n/a
Change 3311876 on 2017/02/20 by Rolando.Caloca
DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB
#jira UE-42014
Change 3314139 on 2017/02/21 by Rolando.Caloca
DR - Minor cleanup pass
- Remove FVulkanPendingState
- Renamed some classes for clarity
- Hoist pending UAVs for flush out to pending compute state
Change 3314642 on 2017/02/21 by Rolando.Caloca
DR - Some more renaming
Change 3315431 on 2017/02/21 by Ben.Salem
Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time.
#tests Ran showdown demo several times
Change 3316710 on 2017/02/22 by Rolando.Caloca
DR - hlslcc - Fix refract intrinsic
Change 3316718 on 2017/02/22 by Rolando.Caloca
DR - hlslcc - Built libs to pick up change from 3316710 - refract fix
Change 3316820 on 2017/02/22 by Benjamin.Hyder
updating Tm-TrigNodes map
Change 3317192 on 2017/02/22 by Benjamin.Hyder
Updating QA-Decals map
Change 3317528 on 2017/02/22 by Benjamin.Hyder
Updating QA-Decals map
Change 3317639 on 2017/02/22 by Benjamin.Hyder
Updating Decal on Complex Mesh example in QA-Decals
Change 3317764 on 2017/02/22 by Benjamin.Hyder
Final updates to QA-Decals
Change 3318319 on 2017/02/22 by Rolando.Caloca
DR - minor reorg/rename
Change 3318379 on 2017/02/22 by Rolando.Caloca
DR - more cleanup
Change 3321181 on 2017/02/24 by Rolando.Caloca
DR - Fix GL bug
Change 3321247 on 2017/02/24 by Rolando.Caloca
DR - Fix misc bugs
Change 3321898 on 2017/02/24 by Chris.Bunner
Only issue clear TLV dispatch if required.
#jira UERNDR-193
Change 3321904 on 2017/02/24 by Chris.Bunner
Added comment for potential future optimization.
Change 3322013 on 2017/02/24 by Uriel.Doyon
Fixed separate translucency being affected by Gaussian DOF
#jira UE-40489
Change 3322517 on 2017/02/24 by Uriel.Doyon
Fixed issue with InvestigateTexture command removing budget limit.
Fixed StreamingBounds show flag not working. It now shows the streaming bound for the currently selected textures.
#jira UE-40485
Change 3323470 on 2017/02/27 by Chad.Garyet
Removing DDC job from dev-rendering
Change 3323479 on 2017/02/27 by Chad.Garyet
Removing RDU agent type
Change 3323519 on 2017/02/27 by Chad.Garyet
removing NCL/LHR/SEA agent types to clean up space
Change 3323639 on 2017/02/27 by Benjamin.Hyder
More updates to QA-Decals
Change 3324207 on 2017/02/27 by Uriel.Doyon
Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias
Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef
Change 3324396 on 2017/02/27 by Uriel.Doyon
Fixed an issue with the Streaming Bounds show flag interfering with the static level data initialization
#jira UE-40485
Change 3325227 on 2017/02/28 by Chris.Bunner
Fix-up AMD AGS libs.
Change 3325566 on 2017/02/28 by Uriel.Doyon
Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num
Change 3326009 on 2017/02/28 by Uriel.Doyon
Better fix for 3325566, as the previous fix would ignore the material instance overrides.
Change 3327058 on 2017/03/01 by Benjamin.Hyder
Preparing TM_Shadermodels map for automation
Change 3328222 on 2017/03/01 by Chris.Bunner
Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals.
#jira UE-42449, UE-42446
Change 3329848 on 2017/03/02 by Uriel.Doyon
Added some extra logs to help track UE-42168
Change 3329977 on 2017/03/02 by Rolando.Caloca
DR - Fix bad clear value
Change 3330008 on 2017/03/02 by Benjamin.Hyder
More preparations for QA-Decals automation
Change 3330754 on 2017/03/02 by Daniel.Wright
Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything
Change 3331451 on 2017/03/03 by Marc.Olano
Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal
Change 3331839 on 2017/03/03 by Rolando.Caloca
DR - hlslcc - add missing file to project
Change 3332247 on 2017/03/03 by Rolando.Caloca
DR - Fix for integrated intel
PR #3305
#jira UE-42393
Change 3332259 on 2017/03/03 by Rolando.Caloca
DR - Fix bad index into pixel formats
PR #3237
#jira UE-41855
Change 3332305 on 2017/03/03 by Rolando.Caloca
DR - OpenGL SRV for index buffers
PR #3271
#jira UE-32618
Change 3332313 on 2017/03/03 by Rolando.Caloca
DR - Fix for integrated intel (properly)
PR #3305
#jira UE-42393
Change 3332317 on 2017/03/03 by Rolando.Caloca
DR - OpenGL SRV for index buffers (properly)
PR #3271
#jira UE-32618
Change 3332368 on 2017/03/03 by Rolando.Caloca
DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan
Change 3333690 on 2017/03/06 by Daniel.Wright
[Copy] Changing movable skylight properties no longer affects static draw lists
Change 3333693 on 2017/03/06 by Daniel.Wright
[Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations
Change 3333705 on 2017/03/06 by Daniel.Wright
[Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating point with a project setting.
* 8 bit uses half memory but introduces error for thin surfaces or large meshes.
Change 3333721 on 2017/03/06 by David.Hill
DecalProxy:
Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component).
Change 3333772 on 2017/03/06 by Daniel.Wright
[Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur.
Change 3333790 on 2017/03/06 by Daniel.Wright
[Copy] Mesh distance field generation uses Embree, for a 2.5x speedup
* Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging
Change 3333822 on 2017/03/06 by Daniel.Wright
[Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp
* Moved FMeshUtilities to its own header so the 8k line MeshUtilities.cpp file can be further split up
Change 3333827 on 2017/03/06 by Daniel.Wright
[Copy] Range compress 8bit distance fields - gets one extra bit of precision on average
Change 3333828 on 2017/03/06 by Daniel.Wright
[Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low
Change 3333831 on 2017/03/06 by Daniel.Wright
Non-editor compile fix
Change 3333836 on 2017/03/06 by Daniel.Wright
[Copy] Workaround for global distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb.
Change 3333843 on 2017/03/06 by Daniel.Wright
[Copy] Added OcclusionExponent to skylight component
* Useful for brightening up indoors without losing contact shadows as MinOcclusion does
Change 3333845 on 2017/03/06 by Daniel.Wright
[Copy] Capsule shadow BP functions
Change 3333850 on 2017/03/06 by Daniel.Wright
[Copy] Added OcclusionCombineMode to skylight component
Change 3333854 on 2017/03/06 by Daniel.Wright
[Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu
Change 3333857 on 2017/03/06 by Daniel.Wright
[Copy] Clear light attenuation for local lights with a quad covering their screen extents
* Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms.
* Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4
Change 3333860 on 2017/03/06 by Daniel.Wright
[Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory
Change 3333861 on 2017/03/06 by Daniel.Wright
[Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas
Change 3333869 on 2017/03/06 by Daniel.Wright
[Copy] Volumetric Fog using a volume texture mapped to the camera frustum
* Volumetric fog can be enabled on an Exponential Height Fog component with additional controls
* Lights have a VolumetricScatteringIntensity
* New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale
* Lighting features supported:
* Directional light with CSM and a light function
* Point / spot lights without shadows / light functions / IES profiles
* Skylight with occlusion from distance fields
* Analytical height fog covers the view range past where the volumetric fog ends
* Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability
* Translucency integrates properly into volumetric fog
* Height fog StartDistance is not supported by volumetric fog and should be set to 0.
Change 3333894 on 2017/03/06 by Daniel.Wright
[Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering
Change 3333902 on 2017/03/06 by Daniel.Wright
[Copy] Better handling of volumetric fog enabled with distance of 0
Change 3333903 on 2017/03/06 by Daniel.Wright
[Copy] Fixed volumetric fog trying to render light functions for a point light
Change 3333908 on 2017/03/06 by Daniel.Wright
[Copy] Volumetric materials
* Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities.
* Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius
* Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice.
* Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely
Change 3334134 on 2017/03/06 by Daniel.Wright
[Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree.
Change 3334420 on 2017/03/06 by Daniel.Wright
Fixed RTDF shadows
Change 3335467 on 2017/03/07 by Benjamin.Hyder
Initial submission of QA-Decals map to EngineTest
Change 3335556 on 2017/03/07 by Daniel.Wright
Changed mesh distance field default format back to R16f
Change 3338020 on 2017/03/08 by Daniel.Wright
Disable volumetric fog in vertex shaders for feature levels which don't support it
Change 3339394 on 2017/03/09 by Chris.Bunner
Correctly handle material texture translation error edge case.
#jira UE-42579, UE-42670
Change 3339992 on 2017/03/09 by Daniel.Wright
Only compile volumetric fog shaders on supporting platforms
Change 3341858 on 2017/03/10 by Arne.Schober
Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering)
#RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite
Change 3342004 on 2017/03/10 by Arne.Schober
Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering)
Fix unity build
#RB Marcus.Wassmer
Change 3343307 on 2017/03/13 by Marcus.Wassmer
Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc)
Change 3343732 on 2017/03/13 by Rolando.Caloca
DR - Vulkan compute pipeline & refactor
Change 3344846 on 2017/03/14 by Rolando.Caloca
DR - Android compile fixes
Change 3344883 on 2017/03/14 by Rolando.Caloca
DR - Add missing stencil load/store to PSO initializer
Change 3344985 on 2017/03/14 by Rolando.Caloca
DR - Made load/store actions uint8
Change 3345141 on 2017/03/14 by Rolando.Caloca
DR - vk - Rework render pass hash
Change 3345304 on 2017/03/14 by Benjamin.Hyder
Updating TM-Distancefields map to include TemplateFloor mesh
Change 3345387 on 2017/03/14 by Rolando.Caloca
DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating
Change 3345388 on 2017/03/14 by Rolando.Caloca
DR - Do not stall when creating shaders on Vulkan
Change 3345722 on 2017/03/14 by Chris.Bunner
PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC)
#jira UE-42752
Change 3345723 on 2017/03/14 by Chris.Bunner
Reduce log verbosity causing spamming during landscape editing.
#jira UE-42714
Change 3345725 on 2017/03/14 by Chris.Bunner
[Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes.
Change 3345726 on 2017/03/14 by Chris.Bunner
Typo fixes.
Change 3345732 on 2017/03/14 by Rolando.Caloca
DR - Decouple vertex declaration off BSS
Change 3345746 on 2017/03/14 by Chris.Bunner
Added sign() intrinsic material graph node and delisted material function workaround.
Change 3346042 on 2017/03/14 by Chris.Bunner
Implement missing size query interface for FRenderTargetResources.
#jira UE-41672
Change 3346387 on 2017/03/14 by Daniel.Wright
[Copy] Added VolumetricScatteringIntensity to particle lights
Change 3346389 on 2017/03/14 by Daniel.Wright
[Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs
Disable volumetric fog when the fog show flag is disabled
Change 3346392 on 2017/03/14 by Daniel.Wright
[Copy] Fixed skylight being much too bright on volumetric fog
Change 3346406 on 2017/03/14 by Daniel.Wright
[Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution.
* Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite
Change 3346412 on 2017/03/14 by Daniel.Wright
[Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb
Change 3346414 on 2017/03/14 by Daniel.Wright
[Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb
Change 3346415 on 2017/03/14 by Daniel.Wright
[Copy] Missing file from cl 3338451
Change 3346421 on 2017/03/14 by Daniel.Wright
[Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled
* Volumetric fog converts NaNs to black now so they don't spread
Change 3346422 on 2017/03/14 by Daniel.Wright
[Copy] Fixed NaN in volumetric fog with low density values
Change 3346423 on 2017/03/14 by Daniel.Wright
[Copy] Changed default VolumetricFogScatteringDistribution to .2
Change 3346430 on 2017/03/14 by Daniel.Wright
[Copy] New translucent material option to compute fog per pixel instead of the default per vertex
Change 3346432 on 2017/03/14 by Daniel.Wright
[Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass
Fixed lifetimes of temporary Volumetric Fog render targets
Change 3346526 on 2017/03/14 by Daniel.Wright
[Copy] Volumetric Fog supports point and spot light shadows
* These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map)
* Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0'
* Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing
Change 3347053 on 2017/03/15 by Rolando.Caloca
DR - android compile fix
Change 3347384 on 2017/03/15 by Rolando.Caloca
DR - Fix merge issue
Change 3347643 on 2017/03/15 by Marcus.Wassmer
Fix some bugs with the 'disable stationary skylight for the project' feature.
Fixes lighting in Persona on Paragon.
Change 3347979 on 2017/03/15 by Rolando.Caloca
DR - Allow to automatically apply cached rendertargets to PSO initializer
Change 3348024 on 2017/03/15 by Rolando.Caloca
DR - Remove NullPS on Vulkan to avoid deadlock
Change 3348303 on 2017/03/15 by Rolando.Caloca
DR - Fix for debugging SCW with material SRT
Change 3348357 on 2017/03/15 by Marcus.Wassmer
Fix stencildither and a stencilref bug that was probably breaking decals sometimes.
Change 3348549 on 2017/03/15 by Marcus.Wassmer
Hopefully fix static analysis for potential nullptr access.
Change 3348614 on 2017/03/15 by Marcus.Wassmer
Duplicate some switch changes to fix crash on launch.
Change 3349369 on 2017/03/16 by Gil.Gribb
Fixed botched merge
Change 3349947 on 2017/03/16 by Rolando.Caloca
DR - Fix for mismatched primitive type
Change 3349956 on 2017/03/16 by Benjamin.Hyder
initial updates to TM-DistanceFields map
Change 3350151 on 2017/03/16 by Rolando.Caloca
DR - Fix UT compile issue
Change 3350155 on 2017/03/16 by Rolando.Caloca
DR - Catch mismatched primitive type on PSOs on D3D11
Change 3350192 on 2017/03/16 by Daniel.Wright
Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor
Change 3350736 on 2017/03/16 by Daniel.Wright
Fixed formatting from merge
Change 3350881 on 2017/03/16 by Rolando.Caloca
DR - Fix texture arrays as UAVs on Metal
Change 3350927 on 2017/03/16 by Rolando.Caloca
DR - Fix warning
Change 3350935 on 2017/03/16 by Daniel.Wright
Fix for materials with non-Surface domains being skipped in mesh passes
Change 3351583 on 2017/03/17 by Marcus.Wassmer
Fix clang platforms
Change 3351917 on 2017/03/17 by Marcus.Wassmer
Fix linux compile
Change 3351973 on 2017/03/17 by Marcus.Wassmer
Fix mismatched rendertargetformat
Change 3352038 on 2017/03/17 by Daniel.Wright
Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing
Change 3352110 on 2017/03/17 by Marcus.Wassmer
Fix missing RT PSO apply
Change 3352695 on 2017/03/17 by Arne.Schober
DR - Remove PSO Rendertarget check in DX12 Resolve with Shader.
#RB Rolando.Caloca
Change 3352960 on 2017/03/17 by Arne.Schober
DR - Fix some things that slipped through the PSO merge
#RB none
Change 3353150 on 2017/03/18 by Rolando.Caloca
DR - compile fix
Change 3353205 on 2017/03/18 by Arne.Schober
DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode
#RB none
Change 3353207 on 2017/03/18 by Arne.Schober
DR - Fix Confusion
#RB none
Change 3355183 on 2017/03/20 by Nick.Bullard
Fixed up Content organization for Decals automation tests in EngineTest
Change 3355627 on 2017/03/20 by Arne.Schober
DR - [UE-43094] - removed ensure in composition graph as control of the clear color cannot be guaranteed.
Change 3356342 on 2017/03/21 by Marcus.Wassmer
Fix clang errors
Change 3356591 on 2017/03/21 by Arne.Schober
DR - Fix ensure message
#RB none
Change 3356873 on 2017/03/21 by Arne.Schober
DR - Fix comparison of undefined values in RendertargetApply Check
Change 3357261 on 2017/03/21 by Marcus.Wassmer
Fix LinuxEditor compile
Change 3357294 on 2017/03/21 by Marcus.Wassmer
Add missing SSE functions
Change 3357351 on 2017/03/21 by Frank.Fella
Fix win32 and linux compiler errors
Change 3357370 on 2017/03/21 by Arne.Schober
DR - disable ensure in test builds
#RB Marcus.Wassmer
[CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
# include "RendererModule.h"
# include "ClearQuad.h"
2020-02-06 17:56:50 -05:00
# include "Engine/VolumeTexture.h"
2020-07-06 18:58:26 -04:00
# include "DynamicMeshBuilder.h"
# include "DynamicPrimitiveDrawing.h"
2020-09-08 17:44:06 -04:00
# include "Lumen/Lumen.h"
# include "GlobalDistanceFieldHeightfields.h"
2015-05-11 20:04:15 -04:00
2018-09-11 14:44:10 -04:00
DECLARE_GPU_STAT ( GlobalDistanceFieldUpdate ) ;
2015-05-11 20:04:15 -04:00
int32 GAOGlobalDistanceField = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceField (
Copying //UE4/Dev-Console to //UE4/Dev-Main (Source: //UE4/Dev-Console @ 3483086)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3389969 on 2017/04/12 by Guillaume.Abadie
Implements FDebug::DumpStackTraceToLog() debugging utility.
Change 3391579 on 2017/04/12 by Joe.Barnes
Fix minor spacing issue.
Change 3402629 on 2017/04/20 by Ben.Marsh
Build: Remove job to populate the DDC. Trying out a new system for this.
Change 3417501 on 2017/05/01 by Joe.Barnes
IWYU - Missing #include files.
Change 3419927 on 2017/05/02 by Joe.Barnes
- Support custom LOD for shadow map generation only (r.ForceLODShadow)
- New #define to expose forceLOD and forceLODShadow also in Test/Ship builds (#define EXPOSE_FORCE_LOD 1 in RenderCore.cpp). Not exposed by default.
Change 3420964 on 2017/05/03 by Jonathan.Fitzpatrick
Fixed null dereference of LineBatcher when using DrawDebugSphere and DrawDebugAltCone
#jira UE-30213
Change 3423470 on 2017/05/04 by Luke.Thatcher
[CONSOLE] [STREAMS] [^] Merging //UE4/Dev-Main (CL 3391974) to Dev-Console (//UE4/Dev-Console)
- Compile errors in Switch, to be fixed after check-in.
Change 3430410 on 2017/05/09 by Ben.Woodhouse
Fix uninitialized local variable causing crashes in Test
#jira UE-44832
Change 3430506 on 2017/05/09 by Josh.Adams
- Fixed up the editor platforms' method of loading TargetSettings objects so that we don't need any manual parsing of .ini files to fill out the class defaults
Change 3434035 on 2017/05/10 by Ben.Woodhouse
Integrate updated FortGPUTestbed from Fortnite/Main
Change 3437046 on 2017/05/12 by Joe.Barnes
Fix for clang producing a warning when not all specializations of a templated function are marked FORCEINLINE.
Also, switch a specialization of BlendTransform() from a function with a check to just a declaration so compiler will catch error instead of a runtime catch.
Change 3437259 on 2017/05/12 by Joe.Barnes
Fix for clang producing a warning when not all specializations of a templated function are marked FORCEINLINE.
Also, switch a specialization of BlendTransform() from a function with a check to just a declaration so compiler will catch error instead of a runtime catch.
Change 3440758 on 2017/05/16 by Ben.Woodhouse
Simple threaded CSV Profiler
To capture:
- On the commandline, add -csvCaptureFrames=N to capture N frames from startup
- On the console, use csvprofile start, csvprofile stop or csvprofile frames=N to capture a fixed number of frames
- Instrument with CSV_SCOPED_STAT(statname), CSV_CUSTOM_STAT(statname,value).
CSV capture is enabled in all builds except shipping
- Please do not check in the instrumentation - we don't want to pollute the engine with lots of additional instrumentation. We may add some minimal level of instrumentation at some point
Change 3440954 on 2017/05/16 by Josh.Adams
- Cleaned up some DeviceProfiles in BaseDP.ini
Change 3443778 on 2017/05/17 by Ben.Woodhouse
Aliasing for transient resources + new high level API
Changelists integrated:
3368830
3368887
3377762
3377763
3379513
3381840
3382046
3382138
3385390
3385391
3385531
3396613
3388752
3396756
3397007
3397059
3397780
3397883
3401716
3415179
Change 3451460 on 2017/05/22 by Ben.Woodhouse
Fix editor crash (NULL dereference of ScreenSpaceShadowTexture) when moving the camera around in tm-shadermodels, probably fallout from the VRAM aliasing merge. Not sure if this is the correct fix, but it prevents the crash for now
Change 3451601 on 2017/05/22 by Josh.Adams
- Track idle time from MaxTickRate, so that stat unit is accurate on Game: thread
Change 3452025 on 2017/05/22 by Ben.Woodhouse
Integrate (as edit) CL 3378734 (editor crash fix)
Also add a check for null in LightFunctionRendering.cpp
Change 3452389 on 2017/05/22 by Josh.Adams
- Replaced POCulturePluralForms with a static array, instead of TMapBuilder (was blowing stack or similar on Switch Debug). Code courtesy of Jamie Dale.
Change 3452758 on 2017/05/22 by Joe.Barnes
Add FindFirstClearBit() and FindFirstSetBit() to TStaticBitArray.
Change 3455889 on 2017/05/23 by Ben.Woodhouse
Integrate from //UE4/Main/...@3453436 to //UE4/Dev-Console/...
Change 3458654 on 2017/05/25 by Joe.Conley
Attempting to fix Static Analysis warning.
Change 3462710 on 2017/05/26 by Ben.Woodhouse
Integrate from //UE4/Main/...@3461688 to //UE4/Dev-Console/...
Change 3471711 on 2017/06/02 by Jonathan.Fitzpatrick
Updating MallocProfiler to use the accessor for OnOutOfMemory delegate to conform with change made in CL 3415996.
Change 3473813 on 2017/06/05 by Ben.Woodhouse
Fix streaming visibility logic bug reported on UDN
#jira UE-43892
Change 3475298 on 2017/06/06 by Luke.Thatcher
[CONSOLE] [!] Fix RHITransitionResources crash with more than 16 textures.
- Old command had a fixed sized array of 16 textures that would overflow.
- New command allocates an array of texture pointers inline in the command list, so any number is supported.
#jira UE-45625
Change 3476776 on 2017/06/06 by Ben.Woodhouse
Integrate from //UE4/Main/...@3475908
Change 3479083 on 2017/06/07 by Ben.Woodhouse
Integrate as edit CL 3467315 from dev-animphys:
From Alexis Matte.
Ensure SectionMap is fixed up for old entries that are no longer valid.
#JIRA UE-45438
#jira UE-45735
Change 3480576 on 2017/06/08 by Ben.Woodhouse
Integrate from //UE4/Main/...@3480024 to //UE4/Dev-Console/...
[CL 3483258 by Luke Thatcher in Main branch]
2017-06-09 17:44:13 -04:00
TEXT ( " r.AOGlobalDistanceField " ) ,
2015-05-11 20:04:15 -04:00
GAOGlobalDistanceField ,
TEXT ( " Whether to use a global distance field to optimize occlusion cone traces. \n " )
TEXT ( " The global distance field is created by compositing object distance fields into clipmaps as the viewer moves through the level. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
2022-04-22 19:55:41 -04:00
float GGlobalDistanceFieldOccupancyRatio = 0.4f ;
2020-09-08 17:44:06 -04:00
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldOccupancyRatio (
TEXT ( " r.AOGlobalDistanceField.OccupancyRatio " ) ,
GGlobalDistanceFieldOccupancyRatio ,
TEXT ( " Expected sparse global distance field occupancy for the page atlas allocation. 0.25 means 25% - filled and 75% - empty. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-09-08 17:44:06 -04:00
) ;
2020-07-06 18:58:26 -04:00
int32 GAOGlobalDistanceFieldNumClipmaps = 4 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldNumClipmaps (
TEXT ( " r.AOGlobalDistanceField.NumClipmaps " ) ,
GAOGlobalDistanceFieldNumClipmaps ,
TEXT ( " Num clipmaps in the global distance field. Setting this to anything other than 4 is currently only supported by Lumen. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-07-06 18:58:26 -04:00
) ;
2022-01-21 23:23:16 -05:00
int32 GAOGlobalDistanceFieldHeightfield = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldHeightfield (
TEXT ( " r.AOGlobalDistanceField.Heightfield " ) ,
GAOGlobalDistanceFieldHeightfield ,
TEXT ( " Whether to voxelize Heightfield into the global distance field. \n " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
2015-05-11 20:04:15 -04:00
int32 GAOUpdateGlobalDistanceField = 1 ;
FAutoConsoleVariableRef CVarAOUpdateGlobalDistanceField (
TEXT ( " r.AOUpdateGlobalDistanceField " ) ,
GAOUpdateGlobalDistanceField ,
TEXT ( " Whether to update the global distance field, useful for debugging. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
int32 GAOGlobalDistanceFieldCacheMostlyStaticSeparately = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldCacheMostlyStaticSeparately (
TEXT ( " r.AOGlobalDistanceFieldCacheMostlyStaticSeparately " ) ,
GAOGlobalDistanceFieldCacheMostlyStaticSeparately ,
TEXT ( " Whether to cache mostly static primitives separately from movable primitives, which reduces global DF update cost when a movable primitive is modified. Adds another 12Mb of volume textures. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2017-01-26 19:20:49 -05:00
) ;
2015-05-11 20:04:15 -04:00
int32 GAOGlobalDistanceFieldPartialUpdates = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldPartialUpdates (
TEXT ( " r.AOGlobalDistanceFieldPartialUpdates " ) ,
GAOGlobalDistanceFieldPartialUpdates ,
TEXT ( " Whether to allow partial updates of the global distance field. When profiling it's useful to disable this and get the worst case composition time that happens on camera cuts. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
2016-04-04 18:44:59 -04:00
int32 GAOGlobalDistanceFieldStaggeredUpdates = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldStaggeredUpdatess (
TEXT ( " r.AOGlobalDistanceFieldStaggeredUpdates " ) ,
GAOGlobalDistanceFieldStaggeredUpdates ,
TEXT ( " Whether to allow the larger clipmaps to be updated less frequently. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2016-04-04 18:44:59 -04:00
) ;
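A minimal sketch of what a staggered update schedule could look like, where larger clipmaps are refreshed less often than the innermost one. The function name, the `FrameIndex` parameter, and the specific modulo pattern are assumptions for illustration with four clipmaps; this is not presented as the engine's schedule.

```cpp
// Hypothetical schedule (assumption, not engine code): decide whether a clipmap
// is refreshed on a given frame. Clipmap 0 is the smallest and innermost.
bool SketchShouldUpdateClipmap(int ClipmapIndex, int FrameIndex, bool bStaggeredUpdates)
{
    if (ClipmapIndex == 0 || !bStaggeredUpdates)
    {
        // The innermost clipmap, or every clipmap when staggering is off, updates each frame.
        return true;
    }
    if (ClipmapIndex == 1)
    {
        // Second clipmap: every other frame.
        return FrameIndex % 2 == 0;
    }
    if (ClipmapIndex == 2)
    {
        // Third clipmap: one frame out of four.
        return FrameIndex % 4 == 1;
    }
    // Largest clipmap: the remaining odd frame out of four.
    return FrameIndex % 4 == 3;
}
```

With this pattern at most two clipmaps are refreshed on any frame, which is the kind of cost spreading that the help text for r.AOGlobalDistanceFieldStaggeredUpdates describes.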
2020-07-06 18:58:26 -04:00
int32 GAOGlobalDistanceFieldClipmapUpdatesPerFrame = 2 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldClipmapUpdatesPerFrame (
TEXT ( " r.AOGlobalDistanceFieldClipmapUpdatesPerFrame " ) ,
GAOGlobalDistanceFieldClipmapUpdatesPerFrame ,
TEXT ( " How many clipmaps to update each frame, only 1 or 2 supported. With values less than 2, the first clipmap is only updated every other frame, which can cause incorrect self occlusion during movement. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-07-06 18:58:26 -04:00
) ;
2018-09-11 14:44:10 -04:00
int32 GAOGlobalDistanceFieldForceFullUpdate = 0 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldForceFullUpdate (
TEXT ( " r.AOGlobalDistanceFieldForceFullUpdate " ) ,
GAOGlobalDistanceFieldForceFullUpdate ,
TEXT ( " Whether to force full global distance field update every frame. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2018-09-11 14:44:10 -04:00
) ;
2020-07-06 18:58:26 -04:00
int32 GAOGlobalDistanceFieldForceMovementUpdate = 0 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldForceMovementUpdate (
TEXT ( " r.AOGlobalDistanceFieldForceMovementUpdate " ) ,
GAOGlobalDistanceFieldForceMovementUpdate ,
TEXT ( " Whether to force N texel border on X, Y and Z update each frame. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-07-06 18:58:26 -04:00
) ;
2015-12-10 21:55:37 -05:00
int32 GAOLogGlobalDistanceFieldModifiedPrimitives = 0 ;
FAutoConsoleVariableRef CVarAOLogGlobalDistanceFieldModifiedPrimitives (
TEXT ( " r.AOGlobalDistanceFieldLogModifiedPrimitives " ) ,
GAOLogGlobalDistanceFieldModifiedPrimitives ,
TEXT ( " Whether to log primitive modifications (add, remove, updatetransform) that caused an update of the global distance field. \n " )
2020-07-06 18:58:26 -04:00
TEXT ( " This can be useful for tracking down why updating the global distance field is always costing a lot, since it should be mostly cached. \n " )
TEXT ( " Pass 2 to log only non movable object updates. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-07-06 18:58:26 -04:00
) ;
int32 GAODrawGlobalDistanceFieldModifiedPrimitives = 0 ;
FAutoConsoleVariableRef CVarAODrawGlobalDistanceFieldModifiedPrimitives (
TEXT ( " r.AOGlobalDistanceFieldDrawModifiedPrimitives " ) ,
GAODrawGlobalDistanceFieldModifiedPrimitives ,
2021-08-25 17:05:27 -04:00
TEXT ( " Whether to draw primitive modifications (add, remove, updatetransform) that caused an update of the global distance field. \n " )
2015-12-10 21:55:37 -05:00
TEXT ( " This can be useful for tracking down why updating the global distance field is always costing a lot, since it should be mostly cached. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-12-10 21:55:37 -05:00
) ;
2015-05-11 20:04:15 -04:00
float GAOGlobalDFClipmapDistanceExponent = 2 ;
FAutoConsoleVariableRef CVarAOGlobalDFClipmapDistanceExponent (
TEXT ( " r.AOGlobalDFClipmapDistanceExponent " ) ,
GAOGlobalDFClipmapDistanceExponent ,
TEXT ( " Exponent used to derive each clipmap's size, together with r.AOInnerGlobalDFClipmapDistance. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
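// Illustrative sketch only, not the engine's implementation. The help text above says the
// exponent and r.AOInnerGlobalDFClipmapDistance together determine each clipmap's size.
// The sketch assumes the simplest reading: the extent of clipmap i is the inner distance
// multiplied by the exponent raised to the power i. SketchClipmapExtent and its parameters
// are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: assumes Extent(i) = InnerDistance * Exponent^i.
// InnerDistance stands in for the value of r.AOInnerGlobalDFClipmapDistance,
// Exponent for r.AOGlobalDFClipmapDistanceExponent (default 2 in the code above).
float SketchClipmapExtent(float InnerDistance, float Exponent, int ClipmapIndex)
{
    assert(ClipmapIndex >= 0);
    assert(InnerDistance > 0.0f && Exponent > 0.0f);
    return InnerDistance * std::pow(Exponent, static_cast<float>(ClipmapIndex));
}
```

// Under this assumption, an exponent of 2 doubles the covered distance with each successive
// clipmap, so a small number of clipmaps can cover a large view distance at decreasing detail.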
int32 GAOGlobalDFResolution = 128 ;
FAutoConsoleVariableRef CVarAOGlobalDFResolution (
TEXT ( " r.AOGlobalDFResolution " ) ,
GAOGlobalDFResolution ,
TEXT ( " Resolution of the global distance field. Higher values increase fidelity but also increase memory and composition cost. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3249742)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3232283 on 2016/12/13 by Ben.Woodhouse
D3D12 - downgrade root signature size warning to a log following a discussion with Microsoft. There's not much we can actually do about it, and it's not relevant to all hardware
#jira UE-36999
Change 3232641 on 2016/12/13 by Mark.Satterthwaite
- Eliminate redundant state changes in MetalRHI in the state cache.
- Add a new debug level for setting buffers to nil prior to calls to set*Bytes so that the tool doesn't display incorrect data.
- Make testing for validation & statistics features use the same EMetalFeatures API as everything else for consistency.
- Cache the fallback depth-stencil texture in the state cache and ignore it for determining whether a pass can restart - if we are using this texture its contents are worthless anyway.
Change 3232661 on 2016/12/13 by Mark.Satterthwaite
Re-enable Metal SM5 & DFAO/DistanceFieldShadowing on Intel for 10.12.2 or later.
Change 3232759 on 2016/12/13 by Ben.Woodhouse
Fix memory leak on XB1 when calling GPURealloc with count of 0, suggested on UDN
https://udn.unrealengine.com/questions/326660/gpurealloc-leak.html
Change 3232803 on 2016/12/13 by Ben.Marsh
Add UT to the populate DDC job, and cook UT and Fortnite for Mac as well.
Change 3232836 on 2016/12/13 by Ben.Marsh
Split cooks to populate DDC into separate nodes for each platform. May help to reduce number of timeouts on remote VMs.
Change 3232974 on 2016/12/13 by Rolando.Caloca
DR - Refactor common code to UWorld::RecreateScene
#jira UE-36719
PR #2824
Change 3232976 on 2016/12/13 by Ben.Marsh
Add missing dependency on tools node for Mac cooks. Need to compile SCW first.
Change 3233289 on 2016/12/13 by Olaf.Piesche
Fixing potentially broken spot/point light fade with old content; initialize new properties properly
Change 3233811 on 2016/12/13 by Mark.Satterthwaite
Fix compiling QA-Material tessellation shaders that don't need to emit from Hull or sample in Domain the HSOut buffer which was confusing MetalBackend.
Change 3233854 on 2016/12/13 by Mark.Satterthwaite
More information about texture type validation errors in Metal.
Change 3234650 on 2016/12/14 by Rolando.Caloca
DR - vk - Fix bad aspect on depth cubemaps
Change 3234651 on 2016/12/14 by Rolando.Caloca
DR - vk - Fix for 32 bit crash on dump layer
Change 3234813 on 2016/12/14 by Guillaume.Abadie
Fixes texture mask static lighting when using GBuffer selective outputs.
#jira UE-39527
Change 3235047 on 2016/12/14 by Uriel.Doyon
Refactored HLOD texture streaming strategy to separate forced load from visibility.
Added an incremental update in the last stage of the texture streaming update load to clear any pending work.
Added an option "All" to the "BuildMateriaTexturelStreamingData" command to force rebuild everything.
Change 3235317 on 2016/12/14 by Uriel.Doyon
Removed timed primitives in the texture streaming since it was not used and there is now a fallback implementation in UPrimitiveComponent::GetStreamingTextureInfo.
Change 3235431 on 2016/12/14 by Rolando.Caloca
DR - Fix for Vulkan drawing black
Change 3236788 on 2016/12/15 by Mark.Satterthwaite
Fix 10.11.6 support (aka -nometalv2): the stencil view workaround necessitates a mid-render blit and the way things were set up resulted in the HasValidRenderTargets assert firing. Refactored the code to separate the concept of valid render-states in the cache from active render-states in the render-pass. Now it works as intended and will be needed for 4.15.
Change 3236850 on 2016/12/15 by Mark.Satterthwaite
Make changing the Metal Shader Version project setting prompt the user to restart for the changes to take effect.
#jira UE-39801
Change 3237002 on 2016/12/15 by Benjamin.Hyder
submitting updated TM-Shadermodels map
Change 3237312 on 2016/12/15 by Rolando.Caloca
DR - Change more macros to lambdas
Change 3237394 on 2016/12/15 by Mark.Satterthwaite
Add Metal-specific permutations of TBasePassHS - they affect the C++ definition on all platforms but are only cached or used on Metal - because the way we compile the combined VS+HS tessellation stage requires that the combined VS + HS HLSL code references the same resources, otherwise we get incorrect resource bindings and subsequently fail to render properly. Long-term the Metal tessellation code will need to be refactored so that the vertex shader stage is emitted as a separate shader from the hull shader stage, as this bug will keep cropping back up and continue to complicate the engine.
#jira UE-39799
Change 3237490 on 2016/12/15 by Daniel.Wright
Fixed ULandscapeComponent::GetUsedMaterials
Change 3237597 on 2016/12/15 by Ben.Woodhouse
Disable timestamp queries on pre-Maxwell nvidia hardware. Local testing suggests that this is the major cause of instability in the UE4.14 release.
It's possible that we could be more targeted by only excluding Fermi and older hardware, but identifying fermi hardware by device ID is difficult in practice, since the range overlaps with Kepler.
Change 3237654 on 2016/12/15 by Daniel.Wright
Non-editor compile fix
Change 3238229 on 2016/12/16 by Rolando.Caloca
DR - Remove ExcludeRect from inner RHI Clear methods; ensure will happen if trying to use it
Change 3238236 on 2016/12/16 by Rolando.Caloca
DR - Compile fixes
Change 3238280 on 2016/12/16 by Marc.Olano
Small optimization to Lanczos-3 upsample shader code.
Change 3238321 on 2016/12/16 by Rolando.Caloca
DR - Compile fix
Change 3238331 on 2016/12/16 by Rolando.Caloca
DR - compile fix
Change 3238495 on 2016/12/16 by Marc.Olano
Replace TEA random number generator with PCG.
Was only used in #if-disabled reference rendering, but does make better quality reference rendering when enabled.
Change 3238496 on 2016/12/16 by Marc.Olano
Tone mapping fix for OR-31752, cherry picked from Orion 3208273
The assumption that green approximates luminance fails on red/blue HDR content, resulting in ugly black artifacts. Go back to luminance.
Change 3238520 on 2016/12/16 by Rolando.Caloca
DR - CIS Fix
Change 3238571 on 2016/12/16 by Rolando.Caloca
DR - CIS fix
Change 3238605 on 2016/12/16 by Daniel.Wright
Sharing IndirectLightingCacheTextureSampler samplers
Change 3238626 on 2016/12/16 by Daniel.Wright
Ray Traced Distance Field Shadow optimizations
* Tighter light space tile culling
* Skip ray marching pixels before the RTDF cascade near distance, or further than the cascade far distance
* Depth bounds test on upsample
* Created FLightTileIntersectionParameters for encapsulation of light tile culling functionality
* RTDF shadow time went from 1.8ms -> .8ms and 3.1ms -> 1.2ms in FortGPUTestbed on 7870 with these changes
Change 3238652 on 2016/12/16 by Rolando.Caloca
DR - RHI clear methods no longer have an ExcludeRect, use DrawClearQuad functions instead
Change 3238855 on 2016/12/16 by Rolando.Caloca
DR - Added FRHITexture2D GetSizeXY
Change 3238881 on 2016/12/16 by Rolando.Caloca
DR - CIS fix
Change 3239008 on 2016/12/16 by Arne.Schober
DR - Fixing accidentally returning a stackpointer in EnqueueRenderCommands
Change 3239012 on 2016/12/16 by Arne.Schober
DR - missing file
Change 3239255 on 2016/12/17 by Rolando.Caloca
DR - Remove shader clears from D3D11
Change 3239690 on 2016/12/19 by Rolando.Caloca
DR - vk - Misc fixes from 1.0.37.00 SDK warnings
Change 3239964 on 2016/12/19 by Rolando.Caloca
DR - Fix click on editor not showing selected
Change 3239995 on 2016/12/19 by Rolando.Caloca
DR - Enable dist field on GL4 & Vulkan SM5
Change 3240162 on 2016/12/19 by Daniel.Wright
Added EnableDepthBoundsTest / DisableDepthBoundsTest to RHIUtilites to share some common code
Change 3240163 on 2016/12/19 by Daniel.Wright
Distance field self shadowing controls for hiding world position offset self-shadow artifacts
* Removed static mesh build settings DistanceFieldBias, which shrunk the distance field, breaking AO and shadows
* Added DistanceFieldSelfShadowBias, which prevents occlusion close to the surface only, maintaining shadows on the ground and AO on the ground
Change 3240271 on 2016/12/19 by Daniel.Wright
Use 16 bit indices for distance field objects culled to tiles, when 16 bit will be enough. Saves 10mb of tile culling buffers.
Change 3240282 on 2016/12/19 by Rolando.Caloca
DR - Proper fix for hit proxies clear
- Added missing stencil ref to DrawClearQuad
Change 3240316 on 2016/12/19 by Rolando.Caloca
DR - vk - Fixed some new 1.0.37.0 warnings
Change 3240354 on 2016/12/19 by Rolando.Caloca
DR - Dev shaders on sm4/5
Change 3240759 on 2016/12/20 by Rolando.Caloca
DR - Fix bad crc on GL element declarations
Change 3240895 on 2016/12/20 by Rolando.Caloca
DR - vk - Swapchain fixes
Change 3241057 on 2016/12/20 by Rolando.Caloca
DR - vk - Fix resize on desktop
Change 3241112 on 2016/12/20 by Rolando.Caloca
DR - vk - Fix 1.0.37.0 warnings
- Ignore some warnings we know we can't fix
Change 3241310 on 2016/12/20 by Rolando.Caloca
DR - vk - Fix crash
Change 3241417 on 2016/12/20 by Daniel.Wright
[Copy] Fixed race condition with FPrecomputedLightVolume::Data which was exposed when switching lighting scenarios
Change 3241990 on 2016/12/21 by Daniel.Wright
Converted DistanceFieldVolume data to BulkData
* FDistanceFieldVolumeData Serialize time from .7s on PS4 to 0s
Change 3242005 on 2016/12/21 by Daniel.Wright
Removed unused !USE_DEPTH_RANGE_LISTS path to reduce complexity
Change 3242295 on 2016/12/21 by Bob.Tellez
Duplicating CL#3242294 from //Fortnite/Main
#UE4 Re-applying the fix for rendering editor primitives when r.EarlyZPassOnlyMaterialMasking is enabled
Change 3242487 on 2016/12/21 by Marcus.Wassmer
Fix typo
Change 3243091 on 2016/12/22 by Daniel.Wright
Fixed too many groups dispatched for TConeTraceScreenGridGlobalOcclusionCS
Change 3243161 on 2016/12/22 by Uriel.Doyon
New async tasks for the streaming update. Optimizing the biggest frame cost.
Change 3243179 on 2016/12/22 by Uriel.Doyon
Fixed possible invalid access from the async FNormalizeLightmapTexelFactorTask
Change 3243236 on 2016/12/22 by Daniel.Wright
Fixed DFAO bilateral upsample
* Depth buffer was being unbound due to lack of DepthRead_StencilNop
Change 3243452 on 2016/12/23 by Ben.Woodhouse
Bring back 1024 render query limit workaround on D3D12 which was lost during the merge from partners
#jira UE-35247
Change 3243512 on 2016/12/23 by Uriel.Doyon
Improved task system for texture streaming.
Change 3243742 on 2016/12/26 by Rolando.Caloca
DR - vk - Fix UAV clears
- Removed old validation layer
- Print found device layers
Change 3243745 on 2016/12/27 by Rolando.Caloca
DR - vk - Fix for texture cube arrays
- Warning for ClearUAVs
Change 3243762 on 2016/12/27 by Rolando.Caloca
DR - vk - Always use pipeline cache
Change 3244450 on 2016/12/31 by Rolando.Caloca
DR - vk - Pre reqs for separate transfer queue
Change 3244453 on 2016/12/31 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3244756 on 2017/01/03 by Marcus.Wassmer
Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3244757 on 2017/01/03 by Marcus.Wassmer
Niagara is still experimental in non-task branches.
Change 3245059 on 2017/01/03 by Benjamin.Hyder
Submitting TM-TrigNodes map
Change 3245500 on 2017/01/03 by Olaf.Piesche
Compile fix #1 for post-merge problems
Change 3245572 on 2017/01/03 by Olaf.Piesche
(Speculative) fix #2 for post-merge build problem. Hopefully fixes public distribution level error for cross compiler tool.
Change 3245683 on 2017/01/03 by Marcus.Wassmer
Fix some niagara warnings
Change 3245732 on 2017/01/03 by Marcus.Wassmer
Fix Niagara compile on clang platforms.
Fix a few warnings / static analysis things as well.
Change 3246403 on 2017/01/04 by Rolando.Caloca
DR - vk - Fix bogus warning
Change 3246432 on 2017/01/04 by Marcus.Wassmer
Copying //Tasks/UE4/Dev-Niagara@3246424 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3246538 on 2017/01/04 by Rolando.Caloca
DR - vk - Show hitch time for compute psos
Change 3246580 on 2017/01/04 by Rolando.Caloca
DR - vk - compile fix
Change 3246610 on 2017/01/04 by Rolando.Caloca
DR - Compute PSO pre reqs
Change 3246707 on 2017/01/04 by Marcus.Wassmer
Add missing integer operations to UnrealMathDirectX.h
Change 3246786 on 2017/01/04 by Marcus.Wassmer
Avoid public dependency build errors. Should probably just remove the DDCUtils module instead
Change 3246828 on 2017/01/04 by Olaf.Piesche
UE-39249; need to check the view as well as the view family in CheckAndUpdateLastFrame; scene captures use a different family, but each eye for VR uses a different scene view.
Change 3247026 on 2017/01/04 by Rolando.Caloca
DR - Remove CrossCompilerTool as it's not required anymore
Change 3247086 on 2017/01/04 by Marcus.Wassmer
Remove includes for Core.h monolithic header
Change 3247227 on 2017/01/04 by Marcus.Wassmer
Fix typo and compile errors.
Change 3247228 on 2017/01/04 by Marcus.Wassmer
Use crossplatform intrinsics
Change 3247229 on 2017/01/04 by Marcus.Wassmer
Implement missing integer NEON operations.
Change NEON vectorint to match name and sign from other platforms
Change 3247245 on 2017/01/04 by Marcus.Wassmer
Fixing various warnings/errors from clang platforms (Mac/Linux)
Change 3247331 on 2017/01/04 by Marcus.Wassmer
More Mac/clang fixes
Change 3247958 on 2017/01/05 by Marcus.Wassmer
VectorInt < - > Float ops should be conversions not reinterpret cast
Change 3247959 on 2017/01/05 by Marcus.Wassmer
Add missing ops to non-vector header
Change 3247964 on 2017/01/05 by Rolando.Caloca
DR - Temp fix for crash
#jira UE-40211
Change 3248067 on 2017/01/05 by Rolando.Caloca
DR - Static analysis fixes
#jira UE-40167
Change 3248284 on 2017/01/05 by Rolando.Caloca
DR - Linux Compile fix
#jira UE-40260
Change 3248288 on 2017/01/05 by Rolando.Caloca
DR - Linux compile fix
#jira UE-40264
Change 3248399 on 2017/01/05 by Brian.Karis
Filtered importance sampling for envmap prefiltering.
Fixed SSR on clearcoat with skylight only.
Change 3248503 on 2017/01/05 by Rolando.Caloca
DR - Linux fixes
#jira UE-40264
Change 3248666 on 2017/01/05 by Brian.Karis
Fix GL compile error
Change 3248740 on 2017/01/05 by Marcus.Wassmer
Fix linux and clang errors/warnings
Change 3248851 on 2017/01/05 by Marcus.Wassmer
Simplest fix for ES2 compile errors
Change 3249217 on 2017/01/06 by Simon.Tovey
Speculative fix for static analysis warning
Change 3249296 on 2017/01/06 by Ben.Woodhouse
XB1/Fast semantics:
Add missing L1/L2 cache flush on transition to readable (or RW). The missing cache flush was causing indeterminism when reading from a texture shortly after writing to it as a render target.
This fixes bloom and diffuse irradiance issues
The bug has been there for a while, but CL 3227787 (drawclear early out) caused it to manifest
#jira UE-39727
#jira UE-40238
Change 3249300 on 2017/01/06 by Ben.Woodhouse
Remove workaround for diffuse irradiance (redundant clear). No longer necessary with CL 3249296
Change 3249387 on 2017/01/06 by Rolando.Caloca
DR - Fix GL clear issues
#jira UE-40254
Change 3249435 on 2017/01/06 by Ben.Woodhouse
Duplicated from UT CL 3238664
Fix dbuffer decal rendering issues in fullscreen on PC. Also fixes crash in editor when viewing dbuffer materials.
Pass clearcolor in RT params for system textures to workaround a bug with ClearColorTexture not working in fullscreen mode on DX11. Make sure dbuffer targets are bound if we're rendering mesh decals
#jira UT-6891
#jira UE-39842
Change 3249721 on 2017/01/06 by Marcus.Wassmer
Remove final references to non-existent Niagara data
Change 3249742 on 2017/01/06 by Marcus.Wassmer
Fix missing GPU particles on Mac.
Pointers getting reused is causing the blendstate equality operator to fail.
Simple workaround until we have time for a proper fix.
[CL 3249983 by Marcus Wassmer in Main branch]
2017-01-06 17:51:46 -05:00
float GAOGlobalDFStartDistance = 100 ;
2015-05-11 20:04:15 -04:00
FAutoConsoleVariableRef CVarAOGlobalDFStartDistance (
TEXT ( " r.AOGlobalDFStartDistance " ) ,
GAOGlobalDFStartDistance ,
TEXT ( " World space distance along a cone trace to switch to using the global distance field instead of the object distance fields. \n " )
TEXT ( " This has to be large enough to hide the low res nature of the global distance field, but smaller values result in faster cone tracing. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2015-05-11 20:04:15 -04:00
) ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite of further PSO work where we want to move up State setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky on double underscores and does want the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
Dr - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
int32 GAOGlobalDistanceFieldRepresentHeightfields = 1 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldRepresentHeightfields (
TEXT ( " r.AOGlobalDistanceFieldRepresentHeightfields " ) ,
GAOGlobalDistanceFieldRepresentHeightfields ,
TEXT ( " Whether to put landscape in the global distance field. Changing this won't propagate until the global distance field gets recached (fly away and back). " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
2020-01-10 05:58:31 -05:00
float GGlobalDistanceFieldHeightFieldThicknessScale = 4.0f ;
FAutoConsoleVariableRef CVarGlobalDistanceFieldHeightFieldThicknessScale (
TEXT ( " r.GlobalDistanceFieldHeightFieldThicknessScale " ) ,
GGlobalDistanceFieldHeightFieldThicknessScale ,
TEXT ( " Thickness of the height field when it's entered into the global distance field, measured in distance field voxels. Defaults to 4 which means 4x the voxel size as thickness. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-01-10 05:58:31 -05:00
) ;
2020-07-06 18:58:26 -04:00
float GAOGlobalDistanceFieldMinMeshSDFRadius = 20 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldMinMeshSDFRadius (
TEXT ( " r.AOGlobalDistanceField.MinMeshSDFRadius " ) ,
GAOGlobalDistanceFieldMinMeshSDFRadius ,
TEXT ( " Meshes with a smaller world space radius than this are culled from the global SDF. " ) ,
ECVF_RenderThreadSafe
) ;
float GAOGlobalDistanceFieldMinMeshSDFRadiusInVoxels = .5f ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldMinMeshSDFRadiusInVoxels (
TEXT ( " r.AOGlobalDistanceField.MinMeshSDFRadiusInVoxels " ) ,
GAOGlobalDistanceFieldMinMeshSDFRadiusInVoxels ,
TEXT ( " Meshes with a smaller radius than this number of voxels are culled from the global SDF. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-07-06 18:58:26 -04:00
) ;
float GAOGlobalDistanceFieldCameraPositionVelocityOffsetDecay = .7f ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldCameraPositionVelocityOffsetDecay (
TEXT ( " r.AOGlobalDistanceField.CameraPositionVelocityOffsetDecay " ) ,
GAOGlobalDistanceFieldCameraPositionVelocityOffsetDecay ,
TEXT ( " " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
int32 GAOGlobalDistanceFieldFastCameraMode = 0 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldFastCameraMode (
TEXT ( " r.AOGlobalDistanceField.FastCameraMode " ) ,
GAOGlobalDistanceFieldFastCameraMode ,
TEXT ( " Whether to update the Global SDF for fast camera movement - lower quality, faster updates so lighting can keep up with the camera. " ) ,
ECVF_RenderThreadSafe
) ;
2022-04-22 19:55:41 -04:00
static TAutoConsoleVariable < int32 > CVarAOGlobalDistanceFieldAverageCulledObjectsPerCell (
TEXT ( " r.AOGlobalDistanceField.AverageCulledObjectsPerCell " ) ,
512 ,
TEXT ( " Average expected number of objects per cull grid cell, used to preallocate memory for the cull grid. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-09-08 17:44:06 -04:00
) ;
2020-09-15 11:03:59 -04:00
int32 GAOGlobalDistanceFieldMipFactor = 4 ;
FAutoConsoleVariableRef CVarAOGlobalDistanceFieldMipFactor (
TEXT ( " r.AOGlobalDistanceField.MipFactor " ) ,
GAOGlobalDistanceFieldMipFactor ,
TEXT ( " Resolution divider for the mip map of a distance field clipmap. " ) ,
2020-10-19 10:34:37 -04:00
ECVF_Scalability | ECVF_RenderThreadSafe
2020-09-15 11:03:59 -04:00
) ;
2022-03-01 21:07:45 -05:00
float GLumenSceneGlobalSDFFullyCoveredExpandSurfaceScale = 1.0f ;
FAutoConsoleVariableRef CVarLumenSceneGlobalSDFFullyCoveredExpandSurfaceScale (
TEXT ( " r.LumenScene.GlobalSDF.FullyCoveredExpandSurfaceScale " ) ,
GLumenSceneGlobalSDFFullyCoveredExpandSurfaceScale ,
TEXT ( " Scales the half voxel SDF expand used by the Global SDF to reconstruct surfaces that are thinner than the distance between two voxels, erring on the side of over-occlusion. " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
float GLumenSceneGlobalSDFUncoveredExpandSurfaceScale = .6f ;
FAutoConsoleVariableRef CVarLumenSceneGlobalSDFUncoveredExpandSurfaceScale (
TEXT ( " r.LumenScene.GlobalSDF.UncoveredExpandSurfaceScale " ) ,
GLumenSceneGlobalSDFUncoveredExpandSurfaceScale ,
TEXT ( " Scales the half voxel SDF expand used by the Global SDF to reconstruct surfaces that are thinner than the distance between two voxels, for regions of space that only contain Two Sided Mesh SDFs. " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
float GLumenSceneGlobalSDFUncoveredMinStepScale = 4.0f ;
FAutoConsoleVariableRef CVarLumenSceneGlobalSDFUncoveredMinStepScale (
TEXT ( " r.LumenScene.GlobalSDF.UncoveredMinStepScale " ) ,
GLumenSceneGlobalSDFUncoveredMinStepScale ,
TEXT ( " Scales the min step size to improve performance, for regions of space that only contain Two Sided Mesh SDFs. " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
2022-04-22 19:55:41 -04:00
TAutoConsoleVariable < int32 > CVarGlobalDistanceFieldDebug (
TEXT ( " r.GlobalDistanceField.Debug " ) ,
0 ,
TEXT ( " Debug drawing for the Global Distance Field. Requires r.ShaderPrint=1. " ) ,
ECVF_Scalability | ECVF_RenderThreadSafe
) ;
2021-12-10 18:08:26 -05:00
FGlobalDistanceFieldParameters2 SetupGlobalDistanceFieldParameters ( const FGlobalDistanceFieldParameterData & ParameterData )
{
FGlobalDistanceFieldParameters2 ShaderParameters ;
ShaderParameters . GlobalDistanceFieldPageAtlasTexture = OrBlack3DIfNull ( ParameterData . PageAtlasTexture ) ;
2022-03-01 21:07:45 -05:00
ShaderParameters . GlobalDistanceFieldCoverageAtlasTexture = OrBlack3DIfNull ( ParameterData . CoverageAtlasTexture ) ;
2022-01-26 08:27:36 -05:00
ShaderParameters . GlobalDistanceFieldPageTableTexture = OrBlack3DUintIfNull ( ParameterData . PageTableTexture ) ;
2021-12-10 18:08:26 -05:00
ShaderParameters . GlobalDistanceFieldMipTexture = OrBlack3DIfNull ( ParameterData . MipTexture ) ;
2022-04-22 19:55:41 -04:00
for ( int32 Index = 0 ; Index < GlobalDistanceField : : MaxClipmaps ; Index + + )
2021-12-10 18:08:26 -05:00
{
ShaderParameters . GlobalVolumeCenterAndExtent [ Index ] = ParameterData . CenterAndExtent [ Index ] ;
ShaderParameters . GlobalVolumeWorldToUVAddAndMul [ Index ] = ParameterData . WorldToUVAddAndMul [ Index ] ;
ShaderParameters . GlobalDistanceFieldMipWorldToUVScale [ Index ] = ParameterData . MipWorldToUVScale [ Index ] ;
ShaderParameters . GlobalDistanceFieldMipWorldToUVBias [ Index ] = ParameterData . MipWorldToUVBias [ Index ] ;
}
ShaderParameters . GlobalDistanceFieldMipFactor = ParameterData . MipFactor ;
ShaderParameters . GlobalDistanceFieldMipTransition = ParameterData . MipTransition ;
ShaderParameters . GlobalDistanceFieldClipmapSizeInPages = ParameterData . ClipmapSizeInPages ;
2022-02-02 07:59:31 -05:00
ShaderParameters . GlobalDistanceFieldInvPageAtlasSize = ( FVector3f ) ParameterData . InvPageAtlasSize ;
2022-03-01 21:07:45 -05:00
ShaderParameters . GlobalDistanceFieldInvCoverageAtlasSize = ( FVector3f ) ParameterData . InvCoverageAtlasSize ;
2021-12-10 18:08:26 -05:00
ShaderParameters . GlobalVolumeDimension = ParameterData . GlobalDFResolution ;
ShaderParameters . GlobalVolumeTexelSize = 1.0f / ParameterData . GlobalDFResolution ;
ShaderParameters . MaxGlobalDFAOConeDistance = ParameterData . MaxDFAOConeDistance ;
ShaderParameters . NumGlobalSDFClipmaps = ParameterData . NumGlobalSDFClipmaps ;
2022-03-01 21:07:45 -05:00
ShaderParameters . FullyCoveredExpandSurfaceScale = GLumenSceneGlobalSDFFullyCoveredExpandSurfaceScale ;
ShaderParameters . UncoveredExpandSurfaceScale = GLumenSceneGlobalSDFUncoveredExpandSurfaceScale ;
ShaderParameters . UncoveredMinStepScale = GLumenSceneGlobalSDFUncoveredMinStepScale ;
2021-12-10 18:08:26 -05:00
return ShaderParameters ;
}
2020-07-06 18:58:26 -04:00
float GetMinMeshSDFRadius ( float VoxelWorldSize )
{
float MinRadius = GAOGlobalDistanceFieldMinMeshSDFRadius * ( GAOGlobalDistanceFieldFastCameraMode ? 10.0f : 1.0f ) ;
float MinVoxelRadius = GAOGlobalDistanceFieldMinMeshSDFRadiusInVoxels * VoxelWorldSize * ( GAOGlobalDistanceFieldFastCameraMode ? 5.0f : 1.0f ) ;
return FMath : : Max ( MinRadius , MinVoxelRadius ) ;
}
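// Sketch (not engine code): a standalone restatement of the culling threshold computed by GetMinMeshSDFRadius above.
// FMinRadiusSettings and SketchMinMeshSDFRadius are hypothetical names; the defaults copy the cvar defaults declared earlier in this file.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the cvar-backed globals. Defaults mirror
// r.AOGlobalDistanceField.MinMeshSDFRadius (20) and
// r.AOGlobalDistanceField.MinMeshSDFRadiusInVoxels (0.5).
struct FMinRadiusSettings
{
    float MinMeshSDFRadius = 20.0f;
    float MinMeshSDFRadiusInVoxels = 0.5f;
    bool bFastCameraMode = false;
};

// Returns the larger of the world-space threshold and the voxel-relative threshold.
// Fast camera mode scales the two thresholds by 10x and 5x respectively.
inline float SketchMinMeshSDFRadius(const FMinRadiusSettings& Settings, float VoxelWorldSize)
{
    assert(VoxelWorldSize >= 0.0f);
    const float MinRadius = Settings.MinMeshSDFRadius * (Settings.bFastCameraMode ? 10.0f : 1.0f);
    const float MinVoxelRadius = Settings.MinMeshSDFRadiusInVoxels * VoxelWorldSize * (Settings.bFastCameraMode ? 5.0f : 1.0f);
    return std::max(MinRadius, MinVoxelRadius);
}
```

// With the defaults, a clipmap with 10-unit voxels culls meshes below radius 20 (the world-space term wins); with 100-unit voxels the voxel term wins and the threshold becomes 50.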
int32 GetNumClipmapUpdatesPerFrame ( )
{
return GAOGlobalDistanceFieldFastCameraMode ? 1 : GAOGlobalDistanceFieldClipmapUpdatesPerFrame ;
}
2022-01-26 17:07:27 -05:00
int32 GetNumGlobalDistanceFieldClipmaps ( bool bLumenEnabled , float LumenSceneViewDistance )
2020-07-06 18:58:26 -04:00
{
int32 WantedClipmaps = GAOGlobalDistanceFieldNumClipmaps ;
2022-01-26 17:07:27 -05:00
if ( bLumenEnabled )
{
if ( GlobalDistanceField : : GetClipmapExtent ( WantedClipmaps + 1 , nullptr , true ) < = LumenSceneViewDistance )
{
WantedClipmaps + = 2 ;
}
else if ( GlobalDistanceField : : GetClipmapExtent ( WantedClipmaps , nullptr , true ) < = LumenSceneViewDistance )
{
WantedClipmaps + = 1 ;
}
}
2020-07-06 18:58:26 -04:00
extern int32 GLumenDistantScene ;
if ( GAOGlobalDistanceFieldFastCameraMode & & GLumenDistantScene = = 0 )
{
WantedClipmaps + + ;
}
2022-04-22 19:55:41 -04:00
return FMath : : Clamp < int32 > ( WantedClipmaps , 0 , GlobalDistanceField : : MaxClipmaps ) ;
2020-07-06 18:58:26 -04:00
}
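// Sketch (not engine code): how the Lumen branch of GetNumGlobalDistanceFieldClipmaps above adds up to two extra clipmaps.
// All names and parameters here are hypothetical. The inner extent and clipmap cap are passed in rather than read from Lumen or GlobalDistanceField::MaxClipmaps, and the fast camera mode increment is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Lumen clipmap extents double with each index, so scale by 2^ClipmapIndex.
inline float SketchLumenClipmapExtent(int ClipmapIndex, float InnerExtent)
{
    return std::ldexp(InnerExtent, ClipmapIndex);
}

// Mirrors the branch structure: if the clipmap after next still fits within the
// view distance add two clipmaps, else if the next one fits add one.
inline int SketchNumClipmaps(int BaseClipmaps, bool bLumenEnabled, float InnerExtent, float LumenSceneViewDistance, int MaxClipmaps)
{
    int WantedClipmaps = BaseClipmaps;
    if (bLumenEnabled)
    {
        if (SketchLumenClipmapExtent(WantedClipmaps + 1, InnerExtent) <= LumenSceneViewDistance)
        {
            WantedClipmaps += 2;
        }
        else if (SketchLumenClipmapExtent(WantedClipmaps, InnerExtent) <= LumenSceneViewDistance)
        {
            WantedClipmaps += 1;
        }
    }
    return std::min(std::max(WantedClipmaps, 0), MaxClipmaps);
}
```

// Example under these assumptions: 4 base clipmaps and an inner extent of 2500 give candidate extents of 40000 (index 4) and 80000 (index 5), so a view distance of 40000 yields 5 clipmaps and 80000 yields 6, subject to the cap.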
2020-09-08 17:44:06 -04:00
// Global Distance Field Pages
// Must match GlobalDistanceFieldShared.ush
2022-04-22 19:55:41 -04:00
namespace GlobalDistanceField
{
const int32 CullGridFactor = 4 ;
const int32 PageAtlasSizeInPagesX = 128 ;
const int32 PageAtlasSizeInPagesY = 128 ;
}
const int32 GGlobalDistanceFieldPageResolutionInAtlas = 8 ; // Includes 0.5 texel trilinear filter margin
const int32 GGlobalDistanceFieldCoveragePageResolutionInAtlas = 4 ; // Includes 0.5 texel trilinear filter margin
const int32 GGlobalDistanceFieldPageResolution = GGlobalDistanceFieldPageResolutionInAtlas - 1 ;
2020-09-08 17:44:06 -04:00
const int32 GGlobalDistanceFieldInfluenceRangeInVoxels = 4 ;
2021-02-04 15:30:42 -04:00
int32 GlobalDistanceField : : GetClipmapResolution ( bool bLumenEnabled )
2020-09-08 17:44:06 -04:00
{
int32 DFResolution = GAOGlobalDFResolution ;
2021-02-04 15:30:42 -04:00
if ( bLumenEnabled )
2020-09-08 17:44:06 -04:00
{
DFResolution = Lumen : : GetGlobalDFResolution ( ) ;
}
2022-04-22 19:55:41 -04:00
return FMath : : DivideAndRoundUp ( DFResolution , 4 * GGlobalDistanceFieldPageResolution ) * 4 * GGlobalDistanceFieldPageResolution ;
2020-09-08 17:44:06 -04:00
}
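// Sketch (not engine code): the rounding performed by GetClipmapResolution above, shown with plain integers.
// SketchDivideAndRoundUp stands in for FMath::DivideAndRoundUp; the Sketch* names are hypothetical, and the page constants copy the values declared earlier in this file.

```cpp
#include <cassert>

// Integer ceiling division for positive operands.
inline int SketchDivideAndRoundUp(int Dividend, int Divisor)
{
    assert(Divisor > 0);
    return (Dividend + Divisor - 1) / Divisor;
}

const int SketchPageResolutionInAtlas = 8;                        // includes the filter margin
const int SketchPageResolution = SketchPageResolutionInAtlas - 1; // 7 usable voxels per page

// Rounds the requested resolution up to a multiple of 4 * page resolution (28).
inline int SketchClipmapResolution(int RequestedResolution)
{
    const int Granularity = 4 * SketchPageResolution;
    return SketchDivideAndRoundUp(RequestedResolution, Granularity) * Granularity;
}

// Number of pages along one axis of a clipmap.
inline int SketchPageTableClipmapResolution(int RequestedResolution)
{
    return SketchDivideAndRoundUp(SketchClipmapResolution(RequestedResolution), SketchPageResolution);
}
```

// For a requested resolution of 128 this gives ceil(128 / 28) * 28 = 140 voxels per axis, which is 20 pages of 7 voxels.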
2020-09-15 11:03:59 -04:00
int32 GlobalDistanceField : : GetMipFactor ( )
{
return FMath : : Clamp ( GAOGlobalDistanceFieldMipFactor , 1 , 8 ) ;
}
2021-02-04 15:30:42 -04:00
int32 GlobalDistanceField : : GetClipmapMipResolution ( bool bLumenEnabled )
2020-09-15 11:03:59 -04:00
{
2021-02-04 15:30:42 -04:00
return FMath : : DivideAndRoundUp ( GlobalDistanceField : : GetClipmapResolution ( bLumenEnabled ) , GetMipFactor ( ) ) ;
2020-09-15 11:03:59 -04:00
}
2021-02-04 15:30:42 -04:00
float GlobalDistanceField : : GetClipmapExtent ( int32 ClipmapIndex , const FScene * Scene , bool bLumenEnabled )
2020-09-08 17:44:06 -04:00
{
2022-04-22 19:55:41 -04:00
if ( bLumenEnabled )
2020-09-08 17:44:06 -04:00
{
const float InnerClipmapDistance = Lumen : : GetGlobalDFClipmapExtent ( ) ;
2021-01-04 07:59:22 -04:00
return InnerClipmapDistance * FMath : : Pow ( 2.f , ClipmapIndex ) ;
2020-09-08 17:44:06 -04:00
}
else
{
const float InnerClipmapDistance = Scene - > GlobalDistanceFieldViewDistance / FMath : : Pow ( GAOGlobalDFClipmapDistanceExponent , 3 ) ;
return InnerClipmapDistance * FMath : : Pow ( GAOGlobalDFClipmapDistanceExponent , ClipmapIndex ) ;
}
}
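// Sketch (not engine code): the non-Lumen branch of GetClipmapExtent above as a free function.
// SketchClipmapExtent and its parameters are hypothetical; ViewDistance plays the role of Scene->GlobalDistanceFieldViewDistance and DistanceExponent the role of GAOGlobalDFClipmapDistanceExponent.

```cpp
#include <cassert>
#include <cmath>

// The inner extent is chosen so that clipmap index 3 reaches exactly ViewDistance;
// each further index multiplies the extent by DistanceExponent.
inline float SketchClipmapExtent(int ClipmapIndex, float ViewDistance, float DistanceExponent)
{
    assert(DistanceExponent > 0.0f);
    const float InnerClipmapDistance = ViewDistance / std::pow(DistanceExponent, 3.0f);
    return InnerClipmapDistance * std::pow(DistanceExponent, static_cast<float>(ClipmapIndex));
}
```

// With an assumed view distance of 20000 and exponent 2, indices 0..3 give extents of 2500, 5000, 10000 and 20000.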
2021-02-04 15:30:42 -04:00
uint32 GlobalDistanceField : : GetPageTableClipmapResolution ( bool bLumenEnabled )
2020-09-08 17:44:06 -04:00
{
2021-02-04 15:30:42 -04:00
return FMath : : DivideAndRoundUp ( GlobalDistanceField : : GetClipmapResolution ( bLumenEnabled ) , GGlobalDistanceFieldPageResolution ) ;
2020-09-08 17:44:06 -04:00
}
2022-01-26 17:07:27 -05:00
FIntVector GlobalDistanceField : : GetPageTableTextureResolution ( bool bLumenEnabled , float LumenSceneViewDistance )
2020-09-08 17:44:06 -04:00
{
2022-01-26 17:07:27 -05:00
const int32 NumClipmaps = GetNumGlobalDistanceFieldClipmaps ( bLumenEnabled , LumenSceneViewDistance ) ;
2021-02-04 15:30:42 -04:00
const uint32 PageTableClipmapResolution = GetPageTableClipmapResolution ( bLumenEnabled ) ;
2020-09-08 17:44:06 -04:00
const FIntVector PageTableTextureResolution = FIntVector (
PageTableClipmapResolution ,
PageTableClipmapResolution ,
PageTableClipmapResolution * NumClipmaps ) ;
return PageTableTextureResolution ;
}
2022-01-26 17:07:27 -05:00
FIntVector GlobalDistanceField : : GetPageAtlasSizeInPages ( bool bLumenEnabled , float LumenSceneViewDistance )
2020-09-08 17:44:06 -04:00
{
2022-01-26 17:07:27 -05:00
const FIntVector PageTableTextureResolution = GetPageTableTextureResolution ( bLumenEnabled , LumenSceneViewDistance ) ;
2020-09-08 17:44:06 -04:00
const int32 RequiredNumberOfPages = FMath : : CeilToInt (
PageTableTextureResolution . X * PageTableTextureResolution . Y * PageTableTextureResolution . Z
* ( GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? 2 : 1 )
* FMath : : Clamp ( GGlobalDistanceFieldOccupancyRatio , 0.1f , 1.0f ) ) ;
2022-04-22 19:55:41 -04:00
const int32 RequiredNumberOfPagesInZ = FMath : : DivideAndRoundUp ( RequiredNumberOfPages , GlobalDistanceField : : PageAtlasSizeInPagesX * GlobalDistanceField : : PageAtlasSizeInPagesY ) ;
2020-09-08 17:44:06 -04:00
const FIntVector PageAtlasTextureSizeInPages = FIntVector (
2022-04-22 19:55:41 -04:00
GlobalDistanceField : : PageAtlasSizeInPagesX ,
GlobalDistanceField : : PageAtlasSizeInPagesY ,
2020-09-08 17:44:06 -04:00
RequiredNumberOfPagesInZ ) ;
return PageAtlasTextureSizeInPages ;
}
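// Sketch (not engine code): the page budget arithmetic from GetPageTableTextureResolution and GetPageAtlasSizeInPages above.
// FSketchIntVector and SketchPageAtlasSizeInPages are hypothetical names; the occupancy ratio is passed as a parameter because its cvar default is not shown in this section.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct FSketchIntVector
{
    int X;
    int Y;
    int Z;
};

// The atlas is always 128 x 128 pages in X and Y; only the Z slice count grows.
inline FSketchIntVector SketchPageAtlasSizeInPages(int PageTableClipmapResolution, int NumClipmaps, bool bCacheMostlyStaticSeparately, float OccupancyRatio)
{
    const int AtlasPagesX = 128;
    const int AtlasPagesY = 128;

    // Page table is R x R x (R * NumClipmaps): clipmaps are stacked along Z.
    const int PageTableEntries = PageTableClipmapResolution * PageTableClipmapResolution * (PageTableClipmapResolution * NumClipmaps);

    // Only a fraction of page table entries is expected to hold a page.
    const float ClampedRatio = std::min(std::max(OccupancyRatio, 0.1f), 1.0f);
    const int RequiredPages = static_cast<int>(std::ceil(PageTableEntries * (bCacheMostlyStaticSeparately ? 2 : 1) * ClampedRatio));

    const int PagesPerSlice = AtlasPagesX * AtlasPagesY;
    const int PagesInZ = (RequiredPages + PagesPerSlice - 1) / PagesPerSlice;
    return FSketchIntVector{AtlasPagesX, AtlasPagesY, PagesInZ};
}
```

// Example under these assumptions: 20 pages per axis and 4 clipmaps give 32000 page table entries; with separate static caching and a 0.5 ratio that is 32000 pages, or 2 slices of 16384. Multiplying by the in-atlas page resolution of 8 gives a 1024 x 1024 x 16 texel atlas.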
2022-01-26 17:07:27 -05:00
FIntVector GlobalDistanceField : : GetPageAtlasSize ( bool bLumenEnabled , float LumenSceneViewDistance )
2020-09-08 17:44:06 -04:00
{
2022-01-26 17:07:27 -05:00
const FIntVector PageAtlasTextureSizeInPages = GlobalDistanceField : : GetPageAtlasSizeInPages ( bLumenEnabled , LumenSceneViewDistance ) ;
2020-09-08 17:44:06 -04:00
return PageAtlasTextureSizeInPages * GGlobalDistanceFieldPageResolutionInAtlas ;
}
2022-03-01 21:07:45 -05:00
FIntVector GlobalDistanceField : : GetCoverageAtlasSize ( bool bLumenEnabled , float LumenSceneViewDistance )
{
const FIntVector PageAtlasTextureSizeInPages = GlobalDistanceField : : GetPageAtlasSizeInPages ( bLumenEnabled , LumenSceneViewDistance ) ;
return PageAtlasTextureSizeInPages * GGlobalDistanceFieldCoveragePageResolutionInAtlas ;
}
2022-01-26 17:07:27 -05:00
int32 GlobalDistanceField : : GetMaxPageNum ( bool bLumenEnabled , float LumenSceneViewDistance )
2020-09-08 17:44:06 -04:00
{
2022-01-26 17:07:27 -05:00
const FIntVector PageAtlasTextureSizeInPages = GlobalDistanceField : : GetPageAtlasSizeInPages ( bLumenEnabled , LumenSceneViewDistance ) ;
2020-09-08 17:44:06 -04:00
int32 MaxPageNum = PageAtlasTextureSizeInPages . X * PageAtlasTextureSizeInPages . Y * PageAtlasTextureSizeInPages . Z ;
return MaxPageNum ;
}
2020-02-06 17:56:50 -05:00
// For reading back the distance field data
static FGlobalDistanceFieldReadback * GDFReadbackRequest = nullptr ;
void RequestGlobalDistanceFieldReadback ( FGlobalDistanceFieldReadback * Readback )
{
if ( ensure ( GDFReadbackRequest = = nullptr ) )
{
ensure ( Readback - > ReadbackComplete . IsBound ( ) ) ;
ensure ( Readback - > CallbackThread ! = ENamedThreads : : UnusedAnchor ) ;
GDFReadbackRequest = Readback ;
}
}
2022-01-26 17:07:27 -05:00
void FGlobalDistanceFieldInfo : : UpdateParameterData ( float MaxOcclusionDistance , bool bLumenEnabled , float LumenSceneViewDistance )
2015-05-18 13:21:23 -04:00
{
2020-09-08 17:44:06 -04:00
ParameterData . PageTableTexture = nullptr ;
ParameterData . PageAtlasTexture = nullptr ;
2022-03-01 21:07:45 -05:00
ParameterData . CoverageAtlasTexture = nullptr ;
2020-09-15 11:03:59 -04:00
ParameterData . MipTexture = nullptr ;
2022-01-26 17:07:27 -05:00
ParameterData . MaxPageNum = GlobalDistanceField : : GetMaxPageNum ( bLumenEnabled , LumenSceneViewDistance ) ;
2020-09-08 17:44:06 -04:00
2015-05-18 13:21:23 -04:00
if ( Clipmaps . Num ( ) > 0 )
{
2020-09-08 17:44:06 -04:00
if ( PageAtlasTexture )
{
2022-04-06 18:24:24 -04:00
ParameterData . PageAtlasTexture = PageAtlasTexture - > GetRHI ( ) ;
2020-09-08 17:44:06 -04:00
}
2022-03-01 21:07:45 -05:00
if ( CoverageAtlasTexture )
{
2022-04-06 18:24:24 -04:00
ParameterData . CoverageAtlasTexture = CoverageAtlasTexture - > GetRHI ( ) ;
2022-03-01 21:07:45 -05:00
}
2020-09-08 17:44:06 -04:00
if ( PageTableCombinedTexture )
{
2022-02-14 10:16:26 -05:00
ensureMsgf ( GAOGlobalDistanceFieldCacheMostlyStaticSeparately , TEXT ( " PageTableCombinedTexture should only be allocated when caching mostly static objects separately. " ) ) ;
2022-04-06 18:24:24 -04:00
ParameterData . PageTableTexture = PageTableCombinedTexture - > GetRHI ( ) ;
2020-09-08 17:44:06 -04:00
}
2022-02-14 10:16:26 -05:00
else if ( PageTableLayerTextures [ GDF_Full ] )
2022-02-08 16:53:35 -05:00
{
2022-02-14 10:16:26 -05:00
ensureMsgf ( ! GAOGlobalDistanceFieldCacheMostlyStaticSeparately , TEXT ( " PageTableCombinedTexture should be allocated when caching mostly static objects separately. " ) ) ;
2022-04-06 18:24:24 -04:00
ParameterData . PageTableTexture = PageTableLayerTextures [ GDF_Full ] - > GetRHI ( ) ;
2022-02-08 16:53:35 -05:00
}
2020-09-08 17:44:06 -04:00
2022-02-10 09:28:01 -05:00
FIntVector MipTextureResolution ( 1 , 1 , 1 ) ;
2020-09-15 11:03:59 -04:00
if ( MipTexture )
{
2022-04-06 18:24:24 -04:00
ParameterData . MipTexture = MipTexture - > GetRHI ( ) ;
2022-02-10 09:28:01 -05:00
MipTextureResolution . X = MipTexture - > GetDesc ( ) . Extent . X ;
MipTextureResolution . Y = MipTexture - > GetDesc ( ) . Extent . Y ;
MipTextureResolution . Z = MipTexture - > GetDesc ( ) . Depth ;
2020-09-15 11:03:59 -04:00
}
2022-04-22 19:55:41 -04:00
for ( int32 ClipmapIndex = 0 ; ClipmapIndex < GlobalDistanceField : : MaxClipmaps ; ClipmapIndex + + )
2015-05-18 13:21:23 -04:00
{
if ( ClipmapIndex < Clipmaps . Num ( ) )
{
const FGlobalDistanceFieldClipmap & Clipmap = Clipmaps [ ClipmapIndex ] ;
2022-02-02 07:59:31 -05:00
ParameterData . CenterAndExtent [ ClipmapIndex ] = FVector4f ( ( FVector3f ) Clipmap . Bounds . GetCenter ( ) , Clipmap . Bounds . GetExtent ( ) . X ) ;
2015-05-18 13:21:23 -04:00
// GlobalUV = (WorldPosition - GlobalVolumeCenterAndExtent[ClipmapIndex].xyz + GlobalVolumeScrollOffset[ClipmapIndex].xyz) / (GlobalVolumeCenterAndExtent[ClipmapIndex].w * 2) + .5f;
// WorldToUVMul = 1.0f / (GlobalVolumeCenterAndExtent[ClipmapIndex].w * 2)
// WorldToUVAdd = (GlobalVolumeScrollOffset[ClipmapIndex].xyz - GlobalVolumeCenterAndExtent[ClipmapIndex].xyz) / (GlobalVolumeCenterAndExtent[ClipmapIndex].w * 2) + .5f
const FVector WorldToUVAdd = ( Clipmap . ScrollOffset - Clipmap . Bounds . GetCenter ( ) ) / ( Clipmap . Bounds . GetExtent ( ) . X * 2 ) + FVector ( .5f ) ;
2022-02-02 07:59:31 -05:00
ParameterData . WorldToUVAddAndMul [ ClipmapIndex ] = FVector4f ( ( FVector3f ) WorldToUVAdd , 1.0f / ( Clipmap . Bounds . GetExtent ( ) . X * 2 ) ) ;
2020-09-15 11:03:59 -04:00
2022-01-27 03:30:41 -05:00
ParameterData . MipWorldToUVScale [ ClipmapIndex ] = FVector3f ( FVector ( 1.0f ) / ( 2.0f * Clipmap . Bounds . GetExtent ( ) ) ) ; // LWC_TODO: precision loss
ParameterData . MipWorldToUVBias [ ClipmapIndex ] = FVector3f ( ( - Clipmap . Bounds . Min ) / ( 2.0f * Clipmap . Bounds . GetExtent ( ) ) ) ; // LWC_TODO: precision loss
2020-09-15 11:03:59 -04:00
ParameterData . MipWorldToUVScale [ ClipmapIndex ] . Z = ParameterData . MipWorldToUVScale [ ClipmapIndex ] . Z / Clipmaps . Num ( ) ;
ParameterData . MipWorldToUVBias [ ClipmapIndex ] . Z = ( ParameterData . MipWorldToUVBias [ ClipmapIndex ] . Z + ClipmapIndex ) / Clipmaps . Num ( ) ;
2022-02-10 09:28:01 -05:00
// MipUV.z min and max for this clipmap, inset by half a texel so filtering stays within its own slices; stored in the W components below
const int32 ClipmapMipResolution = GlobalDistanceField : : GetClipmapMipResolution ( bLumenEnabled ) ;
const float MipUVMinZ = ( ClipmapIndex * ClipmapMipResolution + 0.5f ) / MipTextureResolution . Z ;
const float MipUVMaxZ = ( ClipmapIndex * ClipmapMipResolution + ClipmapMipResolution - 0.5f ) / MipTextureResolution . Z ;
ParameterData . MipWorldToUVScale [ ClipmapIndex ] . W = MipUVMinZ ;
ParameterData . MipWorldToUVBias [ ClipmapIndex ] . W = MipUVMaxZ ;
2015-05-18 13:21:23 -04:00
}
else
{
2021-09-22 10:01:48 -04:00
ParameterData . CenterAndExtent [ ClipmapIndex ] = FVector4f ( 0 ) ;
ParameterData . WorldToUVAddAndMul [ ClipmapIndex ] = FVector4f ( 0 ) ;
2022-02-10 09:28:01 -05:00
ParameterData . MipWorldToUVScale [ ClipmapIndex ] = FVector4f ( 0 ) ;
ParameterData . MipWorldToUVBias [ ClipmapIndex ] = FVector4f ( 0 ) ;
2015-05-18 13:21:23 -04:00
}
}
2020-09-15 11:03:59 -04:00
ParameterData . MipFactor = GlobalDistanceField : : GetMipFactor ( ) ;
ParameterData . MipTransition = ( GGlobalDistanceFieldInfluenceRangeInVoxels + ParameterData . MipFactor / GGlobalDistanceFieldInfluenceRangeInVoxels ) / ( 2.0f * GGlobalDistanceFieldInfluenceRangeInVoxels ) ;
2022-01-26 17:07:27 -05:00
ParameterData . ClipmapSizeInPages = GlobalDistanceField : : GetPageTableTextureResolution ( bLumenEnabled , LumenSceneViewDistance ) . X ;
ParameterData . InvPageAtlasSize = FVector ( 1.0f ) / FVector ( GlobalDistanceField : : GetPageAtlasSize ( bLumenEnabled , LumenSceneViewDistance ) ) ;
2022-03-01 21:07:45 -05:00
ParameterData . InvCoverageAtlasSize = FVector ( 1.0f ) / FVector ( GlobalDistanceField : : GetCoverageAtlasSize ( bLumenEnabled , LumenSceneViewDistance ) ) ;
2021-02-04 15:30:42 -04:00
ParameterData . GlobalDFResolution = GlobalDistanceField : : GetClipmapResolution ( bLumenEnabled ) ;
2016-04-04 18:44:59 -04:00
extern float GAOConeHalfAngle ;
2020-09-08 17:44:06 -04:00
const float MaxClipmapExtentX = Clipmaps [ Clipmaps . Num ( ) - 1 ] . Bounds . GetExtent ( ) . X ;
2021-02-04 15:30:42 -04:00
const float MaxClipmapVoxelSize = ( 2.0f * MaxClipmapExtentX ) / GlobalDistanceField : : GetClipmapResolution ( bLumenEnabled ) ;
2020-09-08 17:44:06 -04:00
float MaxClipmapInfluenceRadius = GGlobalDistanceFieldInfluenceRangeInVoxels * MaxClipmapVoxelSize ;
const float GlobalMaxSphereQueryRadius = FMath : : Min ( MaxOcclusionDistance / ( 1.0f + FMath : : Tan ( GAOConeHalfAngle ) ) , MaxClipmapInfluenceRadius ) ;
ParameterData . MaxDFAOConeDistance = GlobalMaxSphereQueryRadius ;
2020-07-06 18:58:26 -04:00
ParameterData . NumGlobalSDFClipmaps = Clipmaps . Num ( ) ;
2015-05-18 13:21:23 -04:00
}
else
{
FPlatformMemory : : Memzero ( & ParameterData , sizeof ( ParameterData ) ) ;
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
	bInitialized = true;
2015-05-18 13:21:23 -04:00
}
2015-05-11 20:04:15 -04:00
/** Constructs and adds an update region based on camera movement for the given axis. */
2020-09-08 17:44:06 -04:00
static void AddUpdateBoundsForAxis(FIntVector MovementInPages,
2020-07-06 18:58:26 -04:00
	const FBox& ClipmapBounds,
2020-09-08 17:44:06 -04:00
	float ClipmapPageSize,
2020-07-06 18:58:26 -04:00
	int32 ComponentIndex,
	TArray<FClipmapUpdateBounds, TInlineAllocator<64>>& UpdateBounds)
2015-05-11 20:04:15 -04:00
{
2020-09-08 17:44:06 -04:00
	FBox AxisUpdateBounds = ClipmapBounds;
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
	if (MovementInPages[ComponentIndex] > 0)
2015-05-11 20:04:15 -04:00
	{
		// Positive axis movement, set the min of that axis to contain the newly exposed area
2020-09-08 17:44:06 -04:00
		AxisUpdateBounds.Min[ComponentIndex] = FMath::Max(ClipmapBounds.Max[ComponentIndex] - MovementInPages[ComponentIndex] * ClipmapPageSize, ClipmapBounds.Min[ComponentIndex]);
2015-05-11 20:04:15 -04:00
	}
2020-09-08 17:44:06 -04:00
	else if (MovementInPages[ComponentIndex] < 0)
2015-05-11 20:04:15 -04:00
	{
		// Negative axis movement, set the max of that axis to contain the newly exposed area
2020-09-08 17:44:06 -04:00
		AxisUpdateBounds.Max[ComponentIndex] = FMath::Min(ClipmapBounds.Min[ComponentIndex] - MovementInPages[ComponentIndex] * ClipmapPageSize, ClipmapBounds.Max[ComponentIndex]);
2015-05-11 20:04:15 -04:00
	}
2020-09-08 17:44:06 -04:00
	if (FMath::Abs(MovementInPages[ComponentIndex]) > 0)
2015-05-11 20:04:15 -04:00
	{
2020-10-19 10:34:37 -04:00
		const FVector CellCenterAndBilinearFootprintBias = FVector((1.0f - 0.5f) * ClipmapPageSize);
		UpdateBounds.Add(FClipmapUpdateBounds(AxisUpdateBounds.GetCenter(), AxisUpdateBounds.GetExtent() + CellCenterAndBilinearFootprintBias, false));
2015-05-11 20:04:15 -04:00
	}
}
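// Illustrative sketch only, not engine code: a 1D version of the per-axis clamp
// performed by AddUpdateBoundsForAxis above. FInterval and NewlyExposedInterval
// are hypothetical names introduced for this example, and std::max / std::min
// stand in for FMath::Max / FMath::Min. The sketch omits the extent bias that the
// engine function adds before storing the update bounds.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 1D stand-in for one axis of an FBox.
struct FInterval
{
	float Min;
	float Max;
};

// Returns the slice of Bounds that became newly exposed after the clipmap
// scrolled by MovementInPages pages along this axis. The slice is clamped so
// that it never grows beyond the clipmap bounds themselves.
// For zero movement the full bounds are returned unchanged; the engine code
// skips adding an update region in that case.
static FInterval NewlyExposedInterval(FInterval Bounds, int MovementInPages, float PageSize)
{
	assert(PageSize > 0.0f);

	FInterval Result = Bounds;
	if (MovementInPages > 0)
	{
		// Scrolled towards +axis: the new data sits at the far (Max) end.
		Result.Min = std::max(Bounds.Max - MovementInPages * PageSize, Bounds.Min);
	}
	else if (MovementInPages < 0)
	{
		// Scrolled towards -axis: the new data sits at the near (Min) end.
		// MovementInPages is negative, so the subtraction moves Max upwards from Min.
		Result.Max = std::min(Bounds.Min - MovementInPages * PageSize, Bounds.Max);
	}
	return Result;
}
```

// With bounds [0, 100] and a page size of 10, a movement of +2 pages exposes
// [80, 100] and a movement of -3 pages exposes [0, 30]. A movement larger than
// the clipmap itself clamps to the full bounds.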
2020-07-06 18:58:26 -04:00
static void GetUpdateFrequencyForClipmap(int32 ClipmapIndex, int32 NumClipmaps, int32& OutFrequency, int32& OutPhase)
2015-05-11 20:04:15 -04:00
{
2020-07-06 18:58:26 -04:00
	if (!GAOGlobalDistanceFieldStaggeredUpdates)
2015-05-11 20:04:15 -04:00
	{
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite of further PSO work where we want to move up State setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky about double underscores and wants the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
		OutFrequency = 1;
		OutPhase = 0;
2015-05-11 20:04:15 -04:00
	}
2020-07-06 18:58:26 -04:00
	else if (GetNumClipmapUpdatesPerFrame() == 1)
2015-05-11 20:04:15 -04:00
	{
2020-07-06 18:58:26 -04:00
		if (ClipmapIndex == 0)
		{
			OutFrequency = 2;
			OutPhase = 0;
		}
		else if (ClipmapIndex == 1)
		{
			OutFrequency = 4;
			OutPhase = 1;
		}
		else if (ClipmapIndex == 2)
		{
			OutFrequency = 8;
			OutPhase = 3;
		}
		else
		{
			if (NumClipmaps > 4)
			{
				if (ClipmapIndex == 3)
				{
					OutFrequency = 16;
					OutPhase = 7;
				}
				else
				{
					OutFrequency = 16;
					OutPhase = 15;
				}
			}
			else
			{
				OutFrequency = 8;
				OutPhase = 7;
			}
		}
2015-05-11 20:04:15 -04:00
	}
	else
	{
2020-07-06 18:58:26 -04:00
		if (ClipmapIndex == 0)
		{
			OutFrequency = 1;
			OutPhase = 0;
		}
		else if (ClipmapIndex == 1)
		{
			OutFrequency = 2;
			OutPhase = 0;
		}
		else if (ClipmapIndex == 2)
		{
			OutFrequency = 4;
			OutPhase = 1;
		}
		else
		{
			if (NumClipmaps > 4)
			{
				if (ClipmapIndex == 3)
				{
					OutFrequency = 8;
					OutPhase = 3;
				}
				else
				{
					OutFrequency = 8;
					OutPhase = 7;
				}
			}
			else
			{
				OutFrequency = 4;
				OutPhase = 3;
			}
		}
2015-05-11 20:04:15 -04:00
	}
}
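// Illustrative sketch only, not engine code: how the frequency/phase pairs from
// the final else branch of GetUpdateFrequencyForClipmap above could stagger
// updates. FClipmapSchedule, GetScheduleTwoPerFrame and ClipmapsDueOnFrame are
// hypothetical names introduced for this example. The scheduling rule used here
// (a clipmap is due when UpdateIndex % Frequency == Phase) is an assumption; the
// code that consumes Frequency and Phase is not shown in this section.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the OutFrequency / OutPhase output pair.
struct FClipmapSchedule
{
	int Frequency;
	int Phase;
};

// Mirrors the values assigned in the final else branch (staggered updates,
// more than one clipmap update per frame).
static FClipmapSchedule GetScheduleTwoPerFrame(int ClipmapIndex, int NumClipmaps)
{
	assert(ClipmapIndex >= 0 && ClipmapIndex < NumClipmaps);

	if (ClipmapIndex == 0) { return {1, 0}; }
	if (ClipmapIndex == 1) { return {2, 0}; }
	if (ClipmapIndex == 2) { return {4, 1}; }
	if (NumClipmaps > 4)
	{
		return (ClipmapIndex == 3) ? FClipmapSchedule{8, 3} : FClipmapSchedule{8, 7};
	}
	return {4, 3};
}

// ASSUMPTION: a clipmap is due on a frame when UpdateIndex % Frequency == Phase.
// Returns the indices of all clipmaps that are due for the given update index.
static std::vector<int> ClipmapsDueOnFrame(int UpdateIndex, int NumClipmaps)
{
	std::vector<int> Due;
	for (int ClipmapIndex = 0; ClipmapIndex < NumClipmaps; ++ClipmapIndex)
	{
		const FClipmapSchedule Schedule = GetScheduleTwoPerFrame(ClipmapIndex, NumClipmaps);
		if (UpdateIndex % Schedule.Frequency == Schedule.Phase)
		{
			Due.push_back(ClipmapIndex);
		}
	}
	return Due;
}
```

// Under that assumption, with 4 or 5 clipmaps the phases never collide: clipmap 0
// is due every frame and exactly one coarser clipmap joins it, so two clipmaps
// are updated per frame. With 4 clipmaps, for example, update indices 0, 1, 2, 3
// select {0, 1}, {0, 2}, {0, 1}, {0, 3}.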
2016-11-03 16:55:27 -04:00
/** Staggers clipmap updates so that only a few clipmaps (typically one or two, see GetUpdateFrequencyForClipmap) are updated per frame. */
2020-07-06 18:58:26 -04:00
static bool ShouldUpdateClipmapThisFrame(int32 ClipmapIndex, int32 NumClipmaps, int32 GlobalDistanceFieldUpdateIndex)
2016-11-03 16:55:27 -04:00
{
	int32 Frequency;
	int32 Phase;
2020-07-06 18:58:26 -04:00
	GetUpdateFrequencyForClipmap(ClipmapIndex, NumClipmaps, Frequency, Phase);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artitfacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState this is a prerequisite of further PSO work where we want to move up State setting in a similar war and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky on double underscores and does want the presission to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
	return GlobalDistanceFieldUpdateIndex % Frequency == Phase;
}
2021-02-04 15:30:42 -04:00
void UpdateGlobalDistanceFieldViewOrigin(const FViewInfo& View, bool bLumenEnabled)
2020-07-06 18:58:26 -04:00
{
	if (View.ViewState)
	{
		if (GAOGlobalDistanceFieldFastCameraMode != 0)
		{
			FVector& CameraVelocityOffset = View.ViewState->GlobalDistanceFieldCameraVelocityOffset;
			const FVector CameraVelocity = View.ViewMatrices.GetViewOrigin() - View.PrevViewInfo.ViewMatrices.GetViewOrigin();
			// Framerate independent decay
2021-12-02 23:53:56 -05:00
			CameraVelocityOffset = CameraVelocityOffset * FMath::Pow(GAOGlobalDistanceFieldCameraPositionVelocityOffsetDecay, View.Family->Time.GetDeltaWorldTimeSeconds()) + CameraVelocity;
2020-07-06 18:58:26 -04:00
			const FScene* Scene = (const FScene*)View.Family->Scene;
2022-01-26 17:07:27 -05:00
			const int32 NumClipmaps = GetNumGlobalDistanceFieldClipmaps(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
2020-07-06 18:58:26 -04:00
			if (Scene && NumClipmaps > 0)
			{
				// Clamp the view origin offset to stay inside the current clipmap extents
2021-02-04 15:30:42 -04:00
				const float LargestVoxelClipmapExtent = GlobalDistanceField::GetClipmapExtent(NumClipmaps - 1, Scene, bLumenEnabled);
2020-07-06 18:58:26 -04:00
				const float MaxCameraDriftFraction = .75f;
				CameraVelocityOffset.X = FMath::Clamp<float>(CameraVelocityOffset.X, -LargestVoxelClipmapExtent * MaxCameraDriftFraction, LargestVoxelClipmapExtent * MaxCameraDriftFraction);
				CameraVelocityOffset.Y = FMath::Clamp<float>(CameraVelocityOffset.Y, -LargestVoxelClipmapExtent * MaxCameraDriftFraction, LargestVoxelClipmapExtent * MaxCameraDriftFraction);
				CameraVelocityOffset.Z = FMath::Clamp<float>(CameraVelocityOffset.Z, -LargestVoxelClipmapExtent * MaxCameraDriftFraction, LargestVoxelClipmapExtent * MaxCameraDriftFraction);
			}
		}
		else
		{
			View.ViewState->GlobalDistanceFieldCameraVelocityOffset = FVector(0.0f, 0.0f, 0.0f);
		}
	}
}
2021-02-04 15:30:42 -04:00
FVector GetGlobalDistanceFieldViewOrigin(const FViewInfo& View, int32 ClipmapIndex, bool bLumenEnabled)
2020-07-06 18:58:26 -04:00
{
	FVector CameraOrigin = View.ViewMatrices.GetViewOrigin();
	if (View.ViewState)
	{
		FVector CameraVelocityOffset = View.ViewState->GlobalDistanceFieldCameraVelocityOffset;
		const FScene* Scene = (const FScene*)View.Family->Scene;
		if (Scene)
		{
			// Clamp the view origin to stay inside the current clipmap extents
2021-02-04 15:30:42 -04:00
			const float ClipmapExtent = GlobalDistanceField::GetClipmapExtent(ClipmapIndex, Scene, bLumenEnabled);
2020-07-06 18:58:26 -04:00
			const float MaxCameraDriftFraction = .75f;
			CameraVelocityOffset.X = FMath::Clamp<float>(CameraVelocityOffset.X, -ClipmapExtent * MaxCameraDriftFraction, ClipmapExtent * MaxCameraDriftFraction);
			CameraVelocityOffset.Y = FMath::Clamp<float>(CameraVelocityOffset.Y, -ClipmapExtent * MaxCameraDriftFraction, ClipmapExtent * MaxCameraDriftFraction);
			CameraVelocityOffset.Z = FMath::Clamp<float>(CameraVelocityOffset.Z, -ClipmapExtent * MaxCameraDriftFraction, ClipmapExtent * MaxCameraDriftFraction);
		}
		CameraOrigin += CameraVelocityOffset;
2022-02-10 09:28:01 -05:00
		if (!View.ViewState->bGlobalDistanceFieldUpdateViewOrigin)
		{
			CameraOrigin = View.ViewState->GlobalDistanceFieldLastViewOrigin;
		}
2020-07-06 18:58:26 -04:00
	}
	return CameraOrigin;
}
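The two functions above combine a frame-rate independent decay with a per-axis clamp. The standalone sketch below illustrates only that arithmetic, outside the engine. `Vec3`, `DecayAndAccumulate`, `ClampSymmetric` and `ClampToExtent` are hypothetical stand-ins, not engine types or functions, and any decay constant passed in is an illustrative value rather than the engine's default.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for FVector; not an engine type.
struct Vec3
{
	double X, Y, Z;
};

// Scales the accumulated offset by DecayPerSecond^DeltaSeconds, then adds this
// frame's camera movement. Raising the decay to the frame time means the total
// decay over a fixed time span does not depend on how many frames it took.
inline Vec3 DecayAndAccumulate(const Vec3& Offset, const Vec3& Velocity, double DecayPerSecond, double DeltaSeconds)
{
	const double Factor = std::pow(DecayPerSecond, DeltaSeconds);
	return Vec3{ Offset.X * Factor + Velocity.X,
	             Offset.Y * Factor + Velocity.Y,
	             Offset.Z * Factor + Velocity.Z };
}

// Clamps Value to the range [-Limit, Limit].
inline double ClampSymmetric(double Value, double Limit)
{
	return std::max(-Limit, std::min(Value, Limit));
}

// Keeps each axis of the offset within a fraction of the clipmap extent,
// mirroring the MaxCameraDriftFraction clamp in the functions above.
inline Vec3 ClampToExtent(const Vec3& Offset, double ClipmapExtent, double MaxDriftFraction = 0.75)
{
	const double Limit = ClipmapExtent * MaxDriftFraction;
	return Vec3{ ClampSymmetric(Offset.X, Limit),
	             ClampSymmetric(Offset.Y, Limit),
	             ClampSymmetric(Offset.Z, Limit) };
}
```

With a stationary camera, decaying once over one second and decaying twice over half a second each give the same offset, which is the property the "Framerate independent decay" comment refers to.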
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 4041614)
#lockdown Nick.Penwarden
============================
MAJOR FEATURES & CHANGES
============================
Change 3774677 by Arne.Schober
DR - Deprecated SetLocal from the RHICmdlist
Fixed some unnecessary PSO collisions.
Change 3809579 by Chris.Bunner
Back out changelist 3774677.
#jira UE-53483
Change 3810363 by Mark.Satterthwaite
More random fixes to mtlpp: most important is the extension to Buffer that allows creation of sub-buffers that are merely views onto a sub-range of the parent. These sub-buffers are valid to use throughout the mtlpp API with two exceptions: they may not be used for visibilityResultsBuffers and Set*BufferOffset functions cannot take this offset into account (as the encoder does not hold onto the buffers and I don't want it to). In the case of Set*BufferOffset the caller has to know what is going on and in the case of visibilityResultsBuffers it'll just assert as it isn't sensible.
This makes it *much* easier to do things like sub-buffer allocation, though the caller must be aware of the alignment restrictions of their intended usage as they are not possible to enforce. For example, a call to SetVertexBuffer requires that the offset alignment match the alignment of the data-type in the shader for "device" resources, while for "constant" data it must be max(4, sizeof(datatype)) on iOS and 256 on macOS. This should allow for much more tightly packed sub-allocations than earlier approaches, though older drivers (e.g. Mac OS X 10.11) enforce only the coarser "constant" data restriction everywhere.
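As a rough illustration of the "constant" data rule quoted above (max(4, sizeof(datatype)) on iOS, 256 on macOS), the following sketch computes the required alignment and rounds a sub-allocation offset up to it. `EPlatform`, `ConstantOffsetAlignment` and `AlignUp` are hypothetical helper names invented for this example; they are not part of mtlpp or Metal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical platform tag for this sketch only.
enum class EPlatform { IOS, MacOS };

// Offset alignment for "constant" address space data, as stated in the
// changelist description: max(4, sizeof(datatype)) on iOS, 256 on macOS.
inline std::size_t ConstantOffsetAlignment(EPlatform Platform, std::size_t DataTypeSize)
{
	if (Platform == EPlatform::MacOS)
	{
		return 256;
	}
	return std::max<std::size_t>(4, DataTypeSize);
}

// Rounds Offset up to the next multiple of Alignment (Alignment must be > 0).
inline std::size_t AlignUp(std::size_t Offset, std::size_t Alignment)
{
	return ((Offset + Alignment - 1) / Alignment) * Alignment;
}
```

A sub-allocator would call `AlignUp` on the candidate offset before creating the sub-buffer view, since the description notes that these restrictions cannot be enforced by the API itself.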
Change 3810407 by Marcus.Wassmer
PR #4322: ShadowSetup Bug Fix: Only stencil mask drawn meshes (Contributed by DSDambuster)
Change 3810676 by Guillaume.Abadie
Makes r.Test.SecondaryUpscaleOverride work with any arbitrary pixel size.
Change 3810696 by Guillaume.Abadie
Adds support for #include "../MyFile.ush" in the shader compiler.
Change 3810698 by Guillaume.Abadie
Implements enum class based shader permutation dimension.
Change 3810699 by Guillaume.Abadie
Implements Diaphragm DOF ground work.
Change 3811536 by Guillaume.Abadie
Pulls the trigger on CircleDOF's setup pass for DiaphragmDOF.
Change 3811958 by Mark.Satterthwaite
More fixes for mtlpp.
Change 3811964 by Mark.Satterthwaite
Only views onto a mtlpp::Buffer should return a valid parent-buffer.
Change 3812604 by Guillaume.Abadie
Changes Diaphragm DOF's source file layout.
Change 3812827 by Mark.Satterthwaite
More missing/broken functionality in mtlpp fixed and fixed obvious leaks.
Change 3812920 by Guillaume.Abadie
Adds support for per mip level UAV in FSceneRenderTarget.
Change 3812926 by Mark.Satterthwaite
Change the way we handle mtlpp resource construction to avoid leaks.
Change 3812960 by Rolando.Caloca
DR - vk - Disable DFGI
Change 3812968 by Rolando.Caloca
DR - Linker fix
Change 3813318 by Mark.Satterthwaite
Fix linear texture allocation from a buffer sub-view.
Change 3813326 by Mark.Satterthwaite
Fix another Metal mtlpp sub-buffer allocation failure.
Change 3813328 by Guillaume.Abadie
Removes global samplers in TAA for GL4, Vulkan and Switch.
Change 3813937 by Rolando.Caloca
DR - Fix logs not getting dumped when r.DumpSCWQueuedJobs is on
Change 3813947 by Rolando.Caloca
DR - noshaderworker should override r.XGEShaderCompile
Change 3817017 by Uriel.Doyon
Fixed texture editor black screen
#jira UE-53653
Change 3818568 by Rolando.Caloca
DR - Fix log when shader jobs crash
- Move log10 to common
- Added COMPILER_VULKAN define
Change 3818603 by Uriel.Doyon
Fix to static analysis warning
Change 3818623 by Rolando.Caloca
DR - Workaround hlslcc loop unrolling bug
Change 3819070 by Uriel.Doyon
Fix to stat duplication.
Change 3819105 by Uriel.Doyon
Refactored volume sample shader to avoid using texture dimension.
Change 3819136 by Rolando.Caloca
DR - vk - Per platform files (empty)
Change 3819180 by Rolando.Caloca
DR - vk - Move defines out of config into per platform
Change 3819247 by Rolando.Caloca
DR - vk - Remove more defines into platform settings
Change 3819318 by Rolando.Caloca
DR - vk - Fixes for linking
Change 3819868 by Rolando.Caloca
DR - vk - Linux & Android fixes
Change 3819873 by Guillaume.Abadie
Adds support for PermutationId on r.DumpShaderDebugInfo=1
Change 3819940 by Rolando.Caloca
DR - vk - Fix Linux issues
Change 3819956 by Rolando.Caloca
DR - vk - Invalid check
Change 3819961 by Michael.Lentine
Hide attributes when plugin is not present
Change 3819980 by Rolando.Caloca
DR - vk - Standard validation always
Change 3820039 by Rolando.Caloca
DR - vk - Fix invalid ensure
Change 3820326 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3820422 by Michael.Lentine
Add back GBufferAO.
Change 3820433 by Rolando.Caloca
DR - Fix D3D12 crash on 20 thread (10x2 cores) machines
Change 3821677 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3821961 by Rolando.Caloca
DR - Vulkan uses real UB by default on non-Android
Change 3821968 by Rolando.Caloca
DR - vk - Update glslang 1.0.65.1
Change 3821969 by Uriel.Doyon
Added support for stat groups that must be sorted by name. Defined by DECLARE_STATS_GROUP_SORTBYNAME.
Change 3821983 by Rolando.Caloca
DR - vk - Change to static array (0.1ms on 10k draw calls)
Change 3824141 by Rolando.Caloca
DR - vk - Fix static analysis
- Bumped up some (c) 2017->2018
Change 3824355 by Rolando.Caloca
DR - vk - Accessor to find out if a cmd buffer has been submitted
Change 3824420 by Rolando.Caloca
DR - Sanity check number of queries per batch on D3D11 as to not break other RHIs
Change 3824463 by Rolando.Caloca
DR - Removed dummy ensure for D3D12
Change 3824609 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3826074 by Mark.Satterthwaite
Start IMP-caching the various descriptor types in mtlpp.
Change 3826098 by Rolando.Caloca
DR - vk - Dump layer compile fixes
Change 3826113 by Rolando.Caloca
DR - vk - Missing dump functions
Change 3826302 by Rolando.Caloca
DR - vk - Compile fix
- Change dump handles to %p
Change 3826635 by Mark.Satterthwaite
Forward declarations required for mtlpp compilation without exposing Metal headers - plus fixes to the mtlpp test compiler.
Change 3827072 by Mark.Satterthwaite
Switch some more mtlpp descriptors over to IMPTables from objc_msgSend.
Change 3827909 by Guillaume.Abadie
Replaces diaphragm DOF's prefiltering with LDS bank coherent bilateral reduction, and implements 1/8 res background gathering pass.
Change 3827952 by Guillaume.Abadie
Updates copyright to year 2018 on diaphragm DOF's new files.
Change 3828055 by Rolando.Caloca
DR - vk - Rename in prep for changes
Change 3828229 by Guillaume.Abadie
Avoids logging a global shader type name multiple times when it has multiple permutations while verifying the global shader map.
Change 3828427 by Guillaume.Abadie
Reimplements Max3x3 gathering post filtering for Diaphragm DOF with proper shader permutation.
Change 3829979 by Guillaume.Abadie
Fixes a color NaN source in diaphragm DOF's TAA pass.
Change 3830116 by Rolando.Caloca
DR - vk - Fix GPU queries/frame time on old system
- New system in place, disabled temporarily
Change 3830169 by Rolando.Caloca
DR - vk - Fix async pso creation crash
Change 3830193 by Rolando.Caloca
DR - vk - CPU RHI thread improvement
Change 3830291 by Guillaume.Abadie
Automatically lowers the number of gathering rings on the background half res gather pass as the CoC gets smaller.
Change 3830300 by Rolando.Caloca
DR - vk - Static analysis fix: Split VulkanCommon.h out of VulkanConfiguration.h
Change 3830589 by Mark.Satterthwaite
In mtlpp cache the IMPTables for all the Metal @protocol's that are dependent on the MTLDevice, this avoids a mutex & map lookup. Also make all the concrete types store their IMPTable statically as it won't change.
Change 3830793 by Mark.Satterthwaite
Fix a small number of bugs introduced with the mtlpp descriptor and table caching.
Change 3831491 by Jian.Ru
Fix driver version unknown
#jira UE-53688
Change 3832335 by Rolando.Caloca
DR - vk - Change include
Change 3832550 by Rolando.Caloca
DR - vk - Occlusion query rewrite WIP
Change 3832589 by Rolando.Caloca
DR - vk - Minor refactor to pools in prep for timestamps
Change 3832618 by Rolando.Caloca
DR - vk - Do not block timestamp queries
Change 3832636 by Rolando.Caloca
DR - vk - Fix old timestamp queries
Change 3833138 by Rolando.Caloca
DR - vk - Fix timestamp queries
Change 3833249 by Rolando.Caloca
DR - vk - Test lock
Change 3833667 by Rolando.Caloca
DR - vk - Old queries wait on the RHI thread now instead of the driver (disabled)
Change 3833907 by Daniel.Wright
Fixed NextStartOffset UAV index out of bounds
Change 3833918 by Daniel.Wright
D3D12 RHI: only refcount uniform buffers if GRHINeedsExtraDeletionLatency is false, which is no longer the case for PC or Xbox. The refcounting was heavy on performance as reported by a licensee because FRHIResource uses atomics for refcounting, which is only necessary when GRHINeedsExtraDeletionLatency is disabled.
Change 3834852 by Rolando.Caloca
DR - vk - Missing file
Change 3834858 by Guillaume.Abadie
Implements r.DOF.MinimalFullresBlurringRadius
Change 3834979 by Rolando.Caloca
DR - vk - Fix
Change 3836117 by Rolando.Caloca
DR - vk - Update to 1.0.65.1
Change 3836122 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOcclusionBatchCmdBuffer
- Added new error codes/messages
Change 3836421 by Mark.Satterthwaite
For the purposes of debugging and conformance testing mtlpp make it possible to compile *without* the IMP cache so that we call the underlying Objective-C.
Change 3836896 by Uriel.Doyon
Fixed concurrency and exit issues around d3d12 pipeline states on windows.
Change 3837385 by Rolando.Caloca
DR - vk - Dump memory on OOM
Change 3837427 by Rolando.Caloca
DR - vk - Change some arrays to array views
Change 3837800 by Guillaume.Abadie
Implements SHADER_PERMUTATION_RANGE_INT to make contiguous integer permutations that do not start at 0.
Change 3838128 by Rolando.Caloca
DR - vk - Support for non-cached memory types
Change 3838540 by Guillaume.Abadie
Refactors Diaphragm DOF's CoC tile buffer under a single API for better maintainability.
Change 3838731 by Rolando.Caloca
DR - vk - Descriptor pools per command buffer pool (turned off)
Change 3838961 by Rolando.Caloca
DR - vk - Use ring buffer for per frame uniform buffers
- Enable descriptor pools per layout recycled per command buffer
Change 3839087 by Rolando.Caloca
DR - vk - Compile fixes for Android
Change 3839106 by Marcus.Wassmer
PR #4413: Removing unnecessary call to FString::ToLower (Contributed by gsfreema)
Change 3839252 by Mark.Satterthwaite
Fix mtlpp::Resource move operators.
Change 3839426 by Marcus.Wassmer
Duplicate 380972
Make PC GPU Benchmarks more reliable
Change 3840041 by Guillaume.Abadie
Fixes shader compilation failure in TAA with alpha channel through post processing support.
Change 3840257 by Chris.Bunner
Swapping a mul() to * in HLSLTranslator::Dot to allow scalar transformations per a UDN ticket.
Change 3840308 by Rolando.Caloca
DR - vk - Support for UB & non-UB on emulation mode
Change 3840586 by Rolando.Caloca
DR - Copy 3840577
Fix for CPUs with more than 16 cores
Change 3840671 by Rolando.Caloca
DR - vk - Copy from 3840663
Fix for layout ensure on HMD projects on Vulkan
Change 3840980 by Rolando.Caloca
DR - vk - Android compile fixes
Change 3841989 by Guillaume.Abadie
Slices Diaphragm DOF's Gather pass into multiple shader files, and CFLAG_StandardOptimization flag for faster iteration time.
Change 3842216 by Guillaume.Abadie
Fixes DDOF's foreground alpha channel.
Change 3842217 by Guillaume.Abadie
Implements r.DOF.MaximalForegroundBlurringRadius
Change 3842353 by Guillaume.Abadie
Allows disabling foreground gathering with r.DOF.MaximalForegroundBlurringRadius=0
Change 3842747 by Rolando.Caloca
DR - vk - Missing use of GPoolSizeVRAMPercentage
- Support for smaller allocations if page size is not available
Change 3842791 by Rolando.Caloca
DR - vk - Use 95% of available GPU memory to handle some fragmentation
Change 3843690 by Guillaume.Abadie
Fixes diaphragm DOF's foreground after all this refactoring.
Change 3844439 by Guillaume.Abadie
Improves the CoC dilate pass to make the gather pass as fast as possible, but still without artifacts caused by the fast gathering optimisation.
Change 3844946 by Mark.Satterthwaite
rd_route v1.1.1 with attached TPS approval.
For macOS function interposition which is useful for debugging and the occasional workaround.
Change 3845164 by Mark.Satterthwaite
Add LLM support for macOS, including tracking of memory allocated in Objective-C. This makes use of runtime method swizzling in the Objective-C runtime and the rd_route library I added for Richard Wallis, which allows for arbitrary runtime function interposition and allows me to hook the custom allocators used in Apple's many Objective-C frameworks on which the whole macOS edifice is built. Objective-C objects are charged to the calling scope as they are too common to impose their own without murdering frame rate.
We would need a TPS approval for an iOS function interposition library for this to work fully on iOS. If desired in the short term, discarding LowLevelFree events that aren't in the map rather than asserting will work around the problem.
Change 3845849 by Marcus.Wassmer
Fix clang and some normal refactor errors
Change 3846026 by Rolando.Caloca
DR - vk - Descriptor set allocation scheme rewrite
- Type hash for each pool
- Desc sets Pool on device
Change 3846169 by Rolando.Caloca
DR - vk - Remove old code for non-layout descriptor set pools
Change 3846205 by Mark.Satterthwaite
Disambiguate the PatchControlPointOut struct definitions in Metal tessellation shaders at Apple's suggestion to avoid a metallib gotcha.
Change 3846346 by Arne.Schober
DR - Missing Vector instructions
Change 3847037 by Arne.Schober
DR - Fix issue with GPU skincache where the offset of the clothbuffer is not relative to the offset of the actual vertexbuffer.
Fixed MorphTarget Skincache Offset mix-up
Change 3847275 by Marcus.Wassmer
Copying MGPU to Dev-Rendering (//UE4/Dev-Rendering)
Change 3847464 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3847707 by Michael.Lentine
Only use MorphTargetOffset when the shader enables morph targets.
Change 3848533 by Richard.Wallis
Handle Metal adding FirstInstance into [[ instance_id ]] which is different to other APIs. SV_InstanceID and SV_VertexID should now have their respective base instance and base vertex ID's subtracted before use in the shader.
#jira UE-51716
Change 3848625 by Richard.Wallis
Compile Fix
Change 3848725 by Rolando.Caloca
DR - Remove use of Build/SetLocalGraphicsPipelineState
Change 3848797 by Rolando.Caloca
DR - Deprecate Build/SetLocalGraphicsPipelineState
Change 3849237 by Arne.Schober
DR - AddCustom Ver for ModelVertex Serialization
Change 3851247 by Rolando.Caloca
DR - vk - Util functions
Change 3851523 by Arne.Schober
DR - Update Reflection Comparison shot from the BuildFarm.
Change 3851859 by Rolando.Caloca
DR - vk - Skip loader
Change 3851889 by Krzysztof.Narkowicz
Removed lights with lighting channels from the tiled deferred light list. Tiled deferred lights do not support lighting channels and it wasn't worth adding extra complexity to this shader in order to support this special case.
#jira UE-51512
Change 3852181 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3852547 by Uriel.Doyon
Fixed Pre-Exposure shader compilation and Temporal AA issue.
#jira UE-54276
Change 3852637 by Arne.Schober
DR - Fixing Normal Automated Test Result
Change 3853167 by Richard.Wallis
AvfPlayer - support for streaming media. Due to an operator new/delete mismatch in Apple's CFNetwork, we've had to change out one of that framework's allocators using rd_route to avoid the memory corruption.
#jira UE-35637
Change 3853447 by Chris.Bunner
Fixing typos.
Change 3853645 by Krzysztof.Narkowicz
Fixed light functions on subsurface materials
Removed strange code from blending between static and dynamic shadows
#jira UE-50275
Change 3853660 by Rolando.Caloca
DR - Fix OpenGL overwriting texture samplers on forward renderer
Change 3853945 by Mark.Satterthwaite
Duplicate #3831616
Fix the black ground scattering on Metal - we've had issues with the atmospheric fog calculations for a long time - one or more intermediate operations generates different precision on Metal so we end up passing negative values into sqrt which then generates NaN/INF. For Metal when compiling this file and this file only #define sqrt() to sqrt(abs()) so that we don't see any more unexpected black in atmospheric rendering. This is far from ideal but I don't want to apply abs to all inputs of every sqrt because AFAIK this is the only case where we have an issue, and until we have a way to investigate each intermediate calculation that isn't ridiculously, soul-crushingly tedious, it isn't practical to identify the source of the error.
#jira UE-53720
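The failure mode described in this changelist can be reproduced on the CPU: a value that should be zero or slightly positive comes out slightly negative after precision loss, and `sqrt` of it yields NaN. The sketch below shows the `sqrt(abs())` substitution in plain C++; `SqrtAbs` is a hypothetical name for this example and the real change is a shader-side `#define`, not this function.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the sqrt(abs(x)) substitution described above: a slightly negative
// input no longer produces NaN. Note that this changes the result for
// genuinely negative inputs, which is why it is scoped to a single file.
inline float SqrtAbs(float Value)
{
	return std::sqrt(std::fabs(Value));
}
```

An alternative would be to clamp the input to zero before taking the root, but that is a different technique from the one the changelist describes.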
Change 3853966 by Mark.Satterthwaite
Duplicate #3835852
Fix tessellation shaders in Metal with Manual Vertex Fetch enabled:
- The control points index buffer shouldn't collide with anything else.
- We can't use the optimisation of loading texture width & height from the buffer meta-table in tessellation shaders as the combined stages don't guarantee not to clobber unused buffer slots and screw it up when we use linear textures.
#jira UE-53851
Change 3854250 by Uriel.Doyon
Fix fbx automation tests
Change 3854736 by Uriel.Doyon
Added a tooltip to the EV100 slider in the exposure menu.
Using game settings now disables the slider.
#jira UE-53945
Change 3855047 by Jian.Ru
Fix DFAO getting NANs when samples out of ViewRect
#jira UE-54403
Change 3858197 by Krzysztof.Narkowicz
View frustum shadow caster culling for pointlights/spotlights
#jira UE-54381
Change 3860081 by Krzysztof.Narkowicz
Tighter bounding sphere for a spotlight
Replaced IntersectSphere(LightProxy->Origin, LightProxy->Radius) with LightProxy->SphereBounds for tighter culling of spotlights
Directional light GetBoundingSphere() now everywhere returns Sphere((0,0,0),HALF_WORLD_MAX) for consistency and proper SphereBounds
#jira UE-54258
Change 3860324 by Mark.Satterthwaite
Update the macOS deployment target version to 10.12 from 10.11 as we officially ended support for El Capitan a while ago. Should mean that libraries compiled for 10.12 and up won't cause link warnings.
Change 3860945 by Arne.Schober
DR - Fix not releasing SRV on render thread for FPositionVertexBuffer, FStaticMeshVertexBuffer, FColorVertexBuffer, FStaticMeshInstanceBuffer.
#jira UE-54587
Change 3861129 by Jian.Ru
Prevent distance culled objects from casting distance field direct shadows
#jira UE-54533
Change 3861502 by Jian.Ru
Exclude distance culled objects from DFAO calculation
#jira UE-54533
Change 3862243 by Krzysztof.Narkowicz
Changed radius of a directional light's bounding sphere from HALF_WORLD_MAX to WORLD_MAX in order to encompass the entire WORLD_MAX box
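The reasoning behind this radius change can be checked with a little geometry, assuming the WORLD_MAX box is an axis-aligned cube centered at the origin with half-extent HALF_WORLD_MAX. Its corners lie at sqrt(3) times the half-extent from the center, which is more than HALF_WORLD_MAX but less than WORLD_MAX. The helper names below are hypothetical and the half-extent is passed in as a parameter rather than taken from engine constants.

```cpp
#include <cassert>
#include <cmath>

// Distance from the center of an axis-aligned cube to any of its corners.
inline double BoxCornerDistance(double HalfExtent)
{
	return std::sqrt(3.0) * HalfExtent;
}

// True if a sphere centered on the cube's center contains all eight corners.
inline bool SphereContainsBox(double Radius, double HalfExtent)
{
	return Radius >= BoxCornerDistance(HalfExtent);
}
```

For any half-extent H, `SphereContainsBox(H, H)` is false while `SphereContainsBox(2 * H, H)` is true, which matches the move from HALF_WORLD_MAX to WORLD_MAX.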
Change 3863476 by Krzysztof.Narkowicz
Added BuildReflections option to ResavePackages commandlet
#jira UE-54581
Change 3863717 by Rolando.Caloca
DR - vk - Missed using pipeline cache on compute PSOs
Change 3865332 by Arne.Schober
DR - Fix UE-52356 Bone Weight
Change 3866220 by Rolando.Caloca
DR - vk - Fixed GetNativeResource missing on textures
- Added support for -preferNvidia|AMD|Intel
- Added VulkanRHIBridge.h
- Minor fixes
Change 3866222 by Rolando.Caloca
DR - vk - Missed file
Change 3866951 by Krzysztof.Narkowicz
Fixed FreezeRendering on non editor builds: ComputeAndMarkRelevanceForViewParallel was calling FrozenMatricesGuard on multiple threads, reading and writing view matrices state in parallel.
#jira UE-53640
Change 3867231 by Guillaume.Abadie
Adds alpha mode to allow the tonemapper to passthrough the alpha channel for broadcast industry.
Change 3867233 by Guillaume.Abadie
Fixes compilation failures in TAAU with r.PostProcessing.PropagateAlpha==2
Change 3867594 by Daniel.Wright
Removed EditorOnlyDefaultMaterials, which added 79s of shader compilation during startup
Added a dialog when opening the Material Editor on a Default Material, warning of advanced workflow
Preventing Material Editor Apply or Save for a Default Material when the preview material has compilation errors
Change 3870048 by Daniel.Wright
Cleaned up formatting in TranslucentRendering from merges
Change 3870106 by Krzysztof.Narkowicz
Fixed some FArchive Tell()/Seek() 64bit->32bit truncations
Change 3870211 by Rolando.Caloca
DR - vk - Added -vulkanvalidation=N/-vulkanstandardvalidation/-novulkanstandardvalidation to set validation layer behaviour from cmd line
Change 3870225 by Rolando.Caloca
DR - vk - Some platforms do not use a standard swapchain
Change 3870267 by Arne.Schober
DR - SafeRelease SRVs that might be held by the vertex factories (maybe due to indirect use in GlobalResources)
Note that the VFs are not owners of the data, e.g. the underlying buffers might be released before this and this reference counting should be unnecessary
Change 3870647 by Daniel.Wright
Moved FogRendering.h to Renderer
Change 3872130 by Krzysztof.Narkowicz
Disable USE_GLOBAL_CLIP_PLANE for MATERIAL_DOMAIN_POSTPROCESS and MATERIAL_DOMAIN_UI
Merging GitHub Pull request #4459
"When material domain is not needing global clip plane there is no need to generate any code involving it. This does not alter output but removes lot of code at vertex shader and pixel shaders. At least on mobile rendered was actually generating clipping code for ui materials."
#jira UE-54616
Change 3872145 by Rolando.Caloca
DR - vk - Optional SupportsMarkersWithoutExtension
Change 3872404 by Uriel.Doyon
Added some guards when streaming virtual textures.
Fixed optimized UCanvasRenderTarget2D::RepaintCanvas() to prevent resolving the texture twice.
Fixed bad mipmap generation with UCanvasRenderTarget2D.
Change 3872507 by Arne.Schober
Back out changelist 3870267
Change 3874176 by Ben.Marsh
IncludeTool: Add a flag to prevent scanning source files for exported symbols.
Change 3874935 by Krzysztof.Narkowicz
Fixed white thumbnails and other issues with sky lighting on ES3_1 path, by disabling GGX prefiltering, as mobile path doesn't have a single cubemap with all initialized mips. Instead it ping-pongs between 2 partially initialized.
#jira UE-54656
Change 3875710 by Daniel.Wright
Renamed uniform buffer member macros to be much shorter for readability
Change 3876665 by Guillaume.Abadie
Cherry-pick 3870715: Implements DOF's hybrid scattering bare bones.
Change 3876666 by Guillaume.Abadie
Cherry-pick 3871786: DOF hybrid scattering: fixes NaN source, transition to gather on close to screen edge and low intensity.
Change 3876677 by Guillaume.Abadie
Cherry-pick 3872348: Implements neighbor comparison for DOF's scattering compilation pass.
Change 3876680 by Guillaume.Abadie
Cherry-pick 3872357: Oups... fixes build...
Change 3876683 by Guillaume.Abadie
Cherry-pick 3872475: Controls number of mip to generate with DOF's reduce pass.
Change 3876687 by Guillaume.Abadie
Cherry-pick 3874104: Fixes various bugs in diaphragm DOF's hybrid scattering.
Change 3876690 by Guillaume.Abadie
Cherry-pick 3874144: Packs multiple DOF scattering group into same draw instance.
Change 3876694 by Guillaume.Abadie
Cherry-pick 3874275: Switches hybrid scattering to an indexed indirect draw call to reduce scatter vertex shader invocations.
Change 3876695 by Guillaume.Abadie
Cherry-pick 3874674: Records min and max coc on DOF's setup's draw event.
Change 3876783 by Rolando.Caloca
DR - Static analysis fix
Change 3876845 by Guillaume.Abadie
Implements USceneCaptureComponent::ProfilingEventName
Change 3877197 by Rolando.Caloca
DR - vk - OQ fixes (disabled)
Change 3877428 by Krzysztof.Narkowicz
Merged with tiny tweaks Ansel photography plugin improvements from Adam Moss (GitHub pull request #4426):
-The free-roaming photography camera has new constraints by default, i.e. it can't pass through walls
-Photography session can be started and stopped programmatically, e.g. making it possible to bind photography to an alternative hotkey or button combo. This was an often-requested feature.
-Tweakables and utilities are now exposed through a Blueprint Function Library (rather than direct manipulation of console variables)
-The Ansel photography session UI now exposes some engine effect tweakables as sliders. For example, if the game is using depth-of-field then sliders are made available to allow the photographer to change the focal depth etc. The developer may suppress this behavior through the Blueprint Function Library.
-Letterboxing is now removed during multi-part capture, d'oh.
-Tiled shots are taken at full resolution even if ScreenPercentage < 100
-SSR is enabled during super-resolution shots since Ansel is now better at hiding any ensuing artifacts
-Postprocess settings are frozen at session start to avoid discontinuities during photography, i.e. wandering between postprocess volumes when the camera auto-moves for stereo and 360 shots.
#jira UE-54244
#4426
Change 3879086 by Krzysztof.Narkowicz
Fixed sky/reflection capture (without owner) update - they are now updated only with a corresponding world
Change 3879090 by Guillaume.Abadie
Fixes tons of regressions on diaphragm DOF's recombine passes.
Change 3879198 by Rolando.Caloca
DR - vk - Support for real uniform buffers on Android platforms
Change 3879993 by Krzysztof.Narkowicz
-Fixed int64->int32 FArchive offset truncation in TShaderMap, VertexFactory and TextureDerivedData
-Fixed FSerializationHistory bug, when trying to serialize 0 bytes
#jira UE-43203
Change 3881462 by Guillaume.Abadie
Implements full res DOF's setup pass for cheaper full res gathering in recombine pass.
Change 3881524 by Krzysztof.Narkowicz
Fixed compilation by removing FTickableEditorObject from FPreviewScene
Change 3881724 by Chris.Bunner
Static analysis fix.
#jira UE-54762
Change 3881861 by Rolando.Caloca
DR - vk - Fix layout warning when generating mip chain
Change 3881864 by Rolando.Caloca
DR - Use render passes on HZB
Change 3882236 by Yuriy.ODonnell
IndirectLightingColorScale is now applied to SubsurfaceLighting and DiffuseLighting. Was previously only applied to DiffuseLighting.
#jira UE-42534
#github 3326
Change 3882325 by Guillaume.Abadie
Implements FocusOnly lower gathering pass for Diaphragm DOF's slight out of focus temporal stability.
Change 3882340 by Rolando.Caloca
DR - vk - Fix api dump
Change 3882430 by Rolando.Caloca
DR - vk - KHR_maintenance2
Change 3882563 by Rolando.Caloca
DR - Add depth-stencil access mode to PSO initializer
Change 3882929 by Rolando.Caloca
DR - vk - Proper fix for maintenance extension macros
Change 3883087 by Mark.Satterthwaite
Allow disabling VSync in windowed mode for macOS 10.13.4 and above.
Change 3883597 by Guillaume.Abadie
Collapses full and half res DOF setup passes together.
Change 3883702 by Guillaume.Abadie
Fixes mac's build.
Change 3884747 by Uriel.Doyon
Fix for static analysis warning
Change 3884975 by Rolando.Caloca
DR - vk - Move some platform defines to platform properties
Change 3884988 by Rolando.Caloca
DR - vk - Make an override per platform
Change 3885832 by Rolando.Caloca
DR - vk - Cosmetic change to group similar members
Change 3885891 by Rolando.Caloca
DR - vk - Some _RenderThread functions to avoid stalls
Change 3886044 by Rolando.Caloca
DR - Added RHI api _RenderThread version of
RHICreateTextureReference
RHICreateShaderLibrary
RHICreateRenderQuery
Change 3886560 by Guillaume.Abadie
Fixes strong aliasing on TAAU's fast shader permutation.
This adds a 6th neighbor sampling, and switches AA_TONE ON as TAA does for its fast shader permutation.
Change 3886749 by Guillaume.Abadie
Cherry-pick 3884748: Implements DOF's BuildBokehLUT for diaphragm blades simulation.
Only used in hybrid scattering for now.
Change 3886750 by Guillaume.Abadie
Cherry-pick 3885457: Simulates diaphragm blades' curvature on bokeh.
Change 3886752 by Rolando.Caloca
DR - Fix metal static analysis
Change 3887460 by Uriel.Doyon
Fixed two more static analysis warnings.
Change 3888201 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitAfterEveryEndRenderPass
- Fixed bad layout on rendering back buffer
Change 3888209 by Rolando.Caloca
DR - vk - Unity compile fix
Change 3888254 by Rolando.Caloca
DR - vk - Fix async texture layout
Change 3888893 by Guillaume.Abadie
Simulates bokeh in DOF's slight out of focus.
Change 3889085 by Guillaume.Abadie
Fixes DOF's reduce pass sampling outside viewport.
Change 3889924 by Rolando.Caloca
DR - vk - Skip seemingly bad validation error
Change 3890573 by Daniel.Wright
Only initialize FDiaphragmDOFGlobalResource in Feature Level 5
Change 3890590 by Arne.Schober
DR - Fix Paper2d crash. When addMesh is called the Vertex and Index buffers are nulled out. Re-create the Dynamic Mesh builder for every Mesh instead.
#jira UE-55063
Change 3890638 by Arne.Schober
DR - Better fix for Paper2d which honors batching
#jira UE-55063
Change 3891099 by Krzysztof.Narkowicz
1.5 texel shadow offset fix inside Manual2x2PCF based on #4485 GitHub pull request
#jira UE-54985
#4485
Change 3891234 by Krzysztof.Narkowicz
Optimized PCF2x2 and PCF3x3 - merged #4494 GitHub pull request
#jira UE-55121
Change 3891407 by Rolando.Caloca
DR - vk - Set vendor id earlier
Change 3891417 by Rolando.Caloca
DR - vk - Missing layout transitions
Change 3891718 by Arne.Schober
DR - Do not recreate one Frame Resource for dynamic draws
#jira UE-55063
Change 3891925 by Yuriy.ODonnell
Fix/workaround for inconsistent preprocessor definitions for NVAftermath that result in FD3D11DynamicRHI class layout mismatch. NVAftermath support is now enabled by default for Win64.
NVAftermath is declared as a private dependency in D3D11RHI. It does not automatically propagate to modules that explicitly include private RHI headers (OculusHMD, OSVR, OSVRInput). This results in NV_AFTERMATH being defined while compiling RHI module and not defined when compiling other modules, causing memory corruption at runtime.
The long-term solution for this and similar issues requires some mechanism for adding transitive module dependencies, so that anyone that depends on D3D11RHI module would automatically also get the NVAftermath. Additionally, private headers should *never* be included directly by external modules.
The short-term solution is to explicitly add NVAftermath dependency to OculusHMD, OSVR and OSVRInput.
Additionally, NV_AFTERMATH is no longer forced by D3D11RHIPrivate.h when it's not defined. This allows catching this kind of mismatch in the future through a compiler warning (C4668).
#jira UE-53065
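A minimal sketch of why an inconsistent preprocessor definition corrupts memory. FEATURE_X is a hypothetical stand-in for NV_AFTERMATH, and the two structs model how the same class would look to a module that compiled the feature in and to one that did not; none of these names come from D3D11RHI:

```cpp
#include <cstddef>

// View of the class in a module compiled with FEATURE_X=1.
struct FRHIWithFeature
{
    void* Device;
    void* FeatureContext; // only present when the feature is compiled in
    int   FrameCounter;
};

// View of the "same" class in a module where FEATURE_X is undefined,
// which the preprocessor silently treats as 0 inside #if.
struct FRHIWithoutFeature
{
    void* Device;
    int   FrameCounter;
};

// Byte distance between where the two modules believe FrameCounter lives.
std::size_t FrameCounterOffsetMismatch()
{
    return offsetof(FRHIWithFeature, FrameCounter)
         - offsetof(FRHIWithoutFeature, FrameCounter);
}
```

Any module using the second view reads and writes FrameCounter one pointer too early. MSVC warning C4668 reports an undefined macro used in #if, which is why no longer forcing a default value lets the compiler flag the mismatch.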
Change 3891987 by Rolando.Caloca
DR - vk - Support for dedicated allocations
Change 3892339 by Jian.Ru
Fix a crash when tessellation shaders are used in dx12
#jira UE-55127
Change 3892528 by Rolando.Caloca
DR - vk - Update Linux headers
Change 3892867 by Rolando.Caloca
DR - vk - Don't create swapchain if not needed
Change 3893416 by Guillaume.Abadie
Implements bokeh simulation on foreground and background gather.
Change 3893732 by Chris.Bunner
GetRelevance_Internal should use the immediate parent resource, not the base, as some features are overridden by permutations e.g. UsesWorldPositionOffset.
#jira UE-53404
Change 3893868 by Guillaume.Abadie
Allocates diaphragm DOF's buffers and structured buffer only on supported platforms.
Change 3893917 by Chris.Bunner
Potential fix for CIS.
Change 3893933 by Chris.Bunner
Duplicating CL 2647737 as this is the same issue from that JIRA where accessing game-thread data was being prevented. We don't have this check in UMaterial::GetMaterialResource already, but presumably the UMaterialInstance case was never removed as we've not been calling it until now.
Change 3894218 by Rolando.Caloca
DR - vk - Remove stat counters per draw call, gains 10% CPU on Infiltrator
Change 3894579 by Arne.Schober
RT - Fix assert not in RenderingThread from Triangle Renderer.
#jira UE-55247
Change 3894724 by Rolando.Caloca
DR - vk - New API for batching barriers
Change 3894909 by Arne.Schober
DR - Fix crash in Speedtree wind where Renderdata is unavailable
#jira UE-54544
Change 3895414 by Rolando.Caloca
DR - Add a configurable threshold for SCWs time outs
Change 3896429 by Marcus.Wassmer
Allow variable frame-latency delay in FrameGrabber frames. For performance you want at least a 1 frame delay so you don't sync the GPU to the CPU.
Change 3896495 by Marcus.Wassmer
Set pointer properly
Fix CIS
Change 3897253 by Guillaume.Abadie
Fixes CIS warning in diaphragm DOF
Change 3899179 by Guillaume.Abadie
Implements background hybrid scatter occlusion for diaphragm DOF.
Change 3903654 by Rolando.Caloca
DR - vk - Rework dump layer to allow other layers
Change 3903766 by Rolando.Caloca
DR - vk - More wrappers
Change 3904025 by Rolando.Caloca
DR - vk - More wrappers
Change 3904342 by Rolando.Caloca
DR - vk - Track image resources & callstacks
Change 3904346 by Rolando.Caloca
DR - vk - Copy fix from 4.19 for flickering grass
Change 3904510 by Rolando.Caloca
DR - vk - Compile fix
Change 3904914 by Daniel.Wright
[Integrate] Fixed PS4 transitions with forward shading
Change 3904916 by Daniel.Wright
[Integrate] Fixed PS4 transitions with occlusion queries
Change 3905975 by Rolando.Caloca
DR - vk - Missing wrappers
Change 3905977 by Rolando.Caloca
DR - vk - Missed file
Change 3907829 by Rolando.Caloca
DR - Move depth bounds to the PSO
Change 3907832 by Rolando.Caloca
DR - vk - Prep for delaying transitions
Change 3907834 by Rolando.Caloca
DR - vk - Fix for depth stencil issues/validation errors
Change 3907967 by Rolando.Caloca
DR - vk - Linux compile
Change 3908093 by Rolando.Caloca
DR - vk - Fix depthstencil layout on descriptors
Change 3908393 by Rolando.Caloca
DR - vk - Disable dedicated allocation as it causes crashes on Nvidia 700 series
Change 3908401 by Rolando.Caloca
DR - Do transitions outside render pass
Change 3908422 by Rolando.Caloca
DR - vk - Fix transition state not getting stored
Change 3908735 by Guillaume.Abadie
Cherry-pick 3896619: Fixes after TAAU post process material that had wrong default buffer UV.
#jira UE-55317
Change 3908736 by Guillaume.Abadie
Cherry-pick 3891352: Fixes ensure when visualizing HDR with TAAU.
#jira UE-55019
Change 3908753 by Guillaume.Abadie
Lets the renderer lay out the views in the internal render targets however it prefers.
Change 3909119 by Daniel.Wright
Fix some static analysis warnings
Change 3911943 by Rolando.Caloca
DR - vk - Fix for packaging Vulkan projects
Change 3912145 by Rolando.Caloca
DR - vk - Fix layout on streaming textures
Change 3913029 by Rolando.Caloca
DR - Fix missing transition
Change 3913048 by Rolando.Caloca
DR - Fix for hlslcc
Change 3913054 by Rolando.Caloca
DR - vk - Fix number of layers on barrier
Change 3913171 by Rolando.Caloca
DR - vk - Fix for decal missing transition
Change 3913211 by Rolando.Caloca
DR - vk - Add debug name to image tracking
Change 3913449 by Rolando.Caloca
DR - vk - Restore transition
Change 3913466 by Rolando.Caloca
DR - Fix Vulkan EngineTest
Change 3913537 by Rolando.Caloca
DR - vk - Fixes independent samplers & textures (contributed by AMD)
Change 3913548 by Rolando.Caloca
DR - vk - Warning fix
Change 3913691 by Rolando.Caloca
DR - vk - Fixes for parallel (wip)
Change 3914656 by Rolando.Caloca
DR - vk - Fix bug when using separate samplerstates and textures
Change 3914730 by Rolando.Caloca
DR - vk - Bump version
Change 3914764 by Rolando.Caloca
DR - vk - Don't crash on exit
Change 3915532 by Rolando.Caloca
DR - vk - Parallel context fixes
Change 3915589 by Rolando.Caloca
DR - vk - Hoist and rename transition and layout manager class out of the context
Change 3915592 by Rolando.Caloca
DR - Fix gpu marker name
Change 3917607 by Rolando.Caloca
DR - vk - Fix depth bounds on Vulkan
Change 3917609 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3917616 by Rolando.Caloca
DR - Fix D3D11 initialization
Change 3920569 by Rolando.Caloca
DR - vk - Prep for layout mgr refactor
Change 3921023 by Rolando.Caloca
DR - vk - Dump layer fixes
Change 3921623 by Rolando.Caloca
DR - vk - Prep refactor for layouts
- Dump now shows marker tree
Change 3922007 by Rolando.Caloca
DR - vk - Fix extra allocation per draw call
Change 3922442 by Rolando.Caloca
DR - vk - Detect potential issues
Change 3922470 by Rolando.Caloca
DR - vk - Minor optimization
Change 3922482 by Rolando.Caloca
DR - vk - More minor optimizations
Change 3923158 by Rolando.Caloca
DR - Move r.DisableEngineAndAppRegistration out to common RHI and use it on Vulkan
Change 3923486 by Rolando.Caloca
DR - vk - Minor cpu optimizations
Change 3923505 by Rolando.Caloca
DR - vk - Use bigger allocations for uniform buffers
Change 3923516 by Rolando.Caloca
DR - vk - Android compile fix
Change 3923557 by Rolando.Caloca
DR - vk - Cache descriptorset layouts, refactor duplicated code
Change 3923851 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3924153 by Rolando.Caloca
DR - vk - Support for dynamic UBs
Change 3924193 by Rolando.Caloca
DR - vk - Remove old per pso descriptor pools
Change 3924197 by Rolando.Caloca
DR - vk - Remove unused global uniform buffer pool
Change 3924220 by Rolando.Caloca
DR - vk - Wrap some unused classes in their define
Change 3924234 by Rolando.Caloca
DR - vk - Show ring buffer wrapping messages
Change 3924243 by Rolando.Caloca
DR - vk - Fix bad dynamic buffer
Change 3924902 by Rolando.Caloca
DR - vk - Fix crash running infiltrator
Change 3925209 by Rolando.Caloca
DR - vk - Fix bug with dynamic buffers
- Remove old defines
Change 3925300 by Rolando.Caloca
DR - vk - Allow packed uniforms as dynamic UBs (with r.Vulkan.DynamicGlobalUBs)
Change 3925627 by Rolando.Caloca
DR - vk - Move DynamicOffsets into the pipeline state
Change 3925834 by Rolando.Caloca
DR - vk - Cache per stage information
Change 3925835 by Daniel.Wright
Fixed DisplayName for UParticleModuleCollisionGPU
Change 3925897 by Rolando.Caloca
DR - vk - Split update descriptors loop
Change 3926488 by Rolando.Caloca
DR - vk - 16MB for ring buffer on desktop, 8 MB for mobile
Change 3928168 by Guillaume.Abadie
Cherry-pick 3917219: Implements r.DOF.RecombineQuality
Change 3928173 by Guillaume.Abadie
Cherry-pick 3927888: Enables r.DOF.HybridScatter.BackgroundCompositing and r.DOF.HybridScatter.ForegroundCompositing to work when both enabled.
Change 3928216 by Rolando.Caloca
DR - vk - Fix Android
- Fix static analysis
Change 3929119 by Rolando.Caloca
DR - vk - Rename some classes for clarity
- Fix read-only cvar
Change 3929151 by Rolando.Caloca
DR - vk - Rename class
Change 3930046 by Rolando.Caloca
DR - Temp fix Vulkan flickering grass
Change 3930148 by Rolando.Caloca
DR - vk - Only update dirty descriptors
- Use dynamic descriptors for packed global uniform buffers
Change 3930998 by Guillaume.Abadie
Packs shader permutation in different XGE submissions.
Change 3931079 by Rolando.Caloca
DR - vk - Fixes for Android and non-real ubs platforms
Change 3931942 by Krzysztof.Narkowicz
Depth rendering - When EarlyZPassMode is set to DDM_AllOccluders, dynamic objects need also to test bUseAsOccluder just like static ones
#jira none
Change 3932819 by Daniel.Wright
[Integrate] Scene Textures uniform buffer
* Base Pass Uniform Buffer now contains a Scene Textures uniform buffer. Previously the translucent base pass had to check ~40 loose scene texture parameters every draw.
* FMeshMaterialShader's must now bind PassUniformBuffer and supply a valid pass uniform buffer. For most passes this is just FSceneTextureUniformParameters.
* FRendererModule::DrawTileMesh can now cleanly set dummy scene texture resources, just by configuring how the pass uniform buffer is created.
* Moved scene texture shader functions out of Common, into SceneTexturesCommon which must be manually included by shaders that want to use them
* Separate Mobile Scene Textures uniform buffer to silo the platform complexities
Moved DBuffer inputs out of FDeferredPixelShaderParameters and into FOpaqueBasePassUniformParameters
Removed per-frame material uniform expressions. GameTime material node with period is now implemented with an fmod in the shader, without the use of MaterialFloat, so that it will happen at full precision.
* Per-frame expressions were used when the GameTime material node had a period, to do the fmod on the CPU where 32 bit precision is guaranteed, for mobile GPU's where pixel shader precision is sometimes less than 32fp.
Moved forward shading data into the Base Pass Uniform Buffer
Removed instanced stereo support for the light cull grid - will have to be reimplemented without changing SRV's per draw
Base pass sets View Uniform Buffer from DrawRenderState instead of choosing which one to set per-draw
Fixed padding in nested uniform buffer structs
Skip SRV members on Feature Level SM4 and below
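For the GameTime change in this entry, the wrapped value is a plain floating-point remainder. The sketch below shows the operation on the CPU side with a hypothetical function name; the actual shader code is not part of this description:

```cpp
#include <cmath>

// Hypothetical illustration: time wrapped to [0, Period), the value a
// GameTime node with a period is expected to produce.
float PeriodicGameTime(float GameTime, float Period)
{
    return std::fmod(GameTime, Period);
}
```

For example, a game time of 7.5 seconds with a period of 2 seconds wraps to 1.5 seconds.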
Change 3932964 by Rolando.Caloca
DR - vk - Renderdoc on Android
Change 3933095 by Daniel.Wright
Moved FSceneTextureUniformParameters out of the opaque base pass uniform buffer.
* Base Pass shaders now enable SCENE_TEXTURES_DISABLED when compiling for a material of any domain other than MD_Surface. These are used when rendering thumbnails of a material in a different domain, which could be opaque, but the opaque base pass drawing policy does not bind a scene textures uniform buffer, so the shader must not bind it.
* Opaque materials can no longer use EyeAdaptation.
Change 3933096 by Daniel.Wright
Better d3d11 assert message when a uniform buffer was not set by the renderer
Change 3933176 by Rolando.Caloca
DR - vk - Prefer mailbox if available
Change 3933271 by Ryan.Vance
#jira UE-55936
Fixed missing referenced uniform bindings on AR pass-through camera shaders.
Change 3934000 by Guillaume.Abadie
Fixes Win32 build in ShaderCompilerXGE.cpp
Change 3934299 by Guillaume.Abadie
Fixes a bug in DOF's reduce operator that was causing color leaking between background and foreground.
Change 3934699 by Daniel.Wright
Added bAffectDistanceFieldLighting to landscape
Change 3935190 by Daniel.Wright
Forward Light Grid SRV's use StructuredBuffer on Metal, instead of 'invariant Buffer', which throws off RemoveUniformBuffersFromSource parsing
Change 3935606 by Daniel.Wright
Removed LightmapPolicy::Set which was needed for vertex lightmaps
Renamed FVertexFactory::Set to SetStreams to make it findable
Change 3936510 by Rolando.Caloca
DR - vk - Update glslangValidator.exe to 1.0.65.1 for dumped debug SPIRV shaders
Change 3936545 by Richard.Wallis
Clone of CL's (3925763, 3925430, 3925424, 3925385, 3925278) Mark Satt's Xcode fixes from task stream //Tasks/UE4/Dev-UERNDR-354-mtlpp/
Plus XCode 9.2 compile fix in ApplicationPlatformCompilerPreSetup.h for -Wunused-lambda-capture.
Change 3938061 by Daniel.Wright
Vulkan: Added support for SRV's in Uniform Buffers
Change 3938123 by Daniel.Wright
Vulkan: Slightly better assert for null resources in uniform buffer
Change 3939197 by Rolando.Caloca
DR - vk - Disable custom memory mgmt
Change 3939677 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3939809 by Rolando.Caloca
DR - vk - Fixes for async compute
Change 3939875 by Rolando.Caloca
DR - vk - Support for -vktrace
Change 3939977 by Rolando.Caloca
DR - vk - Skip a condition during gather UBs
- Set up efficient compute async var
- Fix validation cmd line
Change 3939982 by Rolando.Caloca
DR - vk - Revert mipchain
Change 3939984 by Rolando.Caloca
DR - vk - Remove unnecessary asserts
Change 3940082 by Rolando.Caloca
DR - vk - Custom mem mgr
Change 3940475 by Rolando.Caloca
DR - vk - Fix DFAO (indirect draw offset)
Change 3940555 by Rolando.Caloca
DR - vk - Minor fixes
Change 3940675 by Rolando.Caloca
DR - vk - Fix indirect type mismatch
Change 3941111 by Rolando.Caloca
DR - Renderpass bGeneratingMips
Change 3941847 by Daniel.Wright
Fixed Volumetric Lightmaps on Static geometry only working if the geometry had been built with Surface Lightmaps before
Change 3941978 by Rolando.Caloca
DR - vk - Minor fixes for presenting on compute queue
Change 3942074 by Rolando.Caloca
DR - vk - Remove some RHI stalls
- Fixed swap chain stat
Change 3943946 by Daniel.Wright
Fixed Texcoord0 on Volume materials on a particle sprite, including SubUV particles.
Change 3944065 by Daniel.Wright
Fixed SceneDepth collision getting broken on GPU particles when a scene capture is rendering
Change 3944158 by Daniel.Wright
Fixed ViewUniformShaderParameters accessing GEngine->PreIntegratedSkinBRDFTexture too early during slate loading screen
Change 3944865 by Rolando.Caloca
DR - vk - Prep for render passes
Change 3945196 by Rolando.Caloca
DR - Move render pass validate to cpp
Change 3945202 by Rolando.Caloca
DR - vk - Some fixes for using real render passes
Change 3945357 by Rolando.Caloca
DR - Fix bad condition
Change 3946295 by Yuriy.ODonnell
Added a sentinel member to FLightMap, which is initialized in the ctor and reset in the dtor. Sentinel is then checked in FLightCacheInterface::GetLightMapInteraction().
This aims to shed some more light on a hard-to-repro crash, which is suspected to be a use-after-free bug: http://crashreporter/Buggs/Show/1785593
Change 3946407 by Rolando.Caloca
DR - vk - Prep for refactor
Change 3946648 by Rolando.Caloca
DR - vk - Fixes for async compute (wip)
Change 3947299 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3948434 by Rolando.Caloca
DR - vk - Fix exiting with parallel
Change 3948928 by Rolando.Caloca
DR - vk - Fix enabling draw markers for tools
Change 3949021 by Rolando.Caloca
DR - vk - Buffer tracking layer
Change 3949602 by Rolando.Caloca
DR - vk - static analysis fix
Change 3949757 by Rolando.Caloca
DR - vk - Remove bogus parameter
Change 3949810 by Rolando.Caloca
DR - vk - Move waits for cmd buffer
Change 3950270 by Guillaume.Abadie
Implements dedicated gather pass for foreground hole filling to avoid being VGPR bound in foreground gather pass, but still being able to amend foreground.
Change 3950272 by Rolando.Caloca
DR - vk - Minor refactor for semaphores
Change 3950279 by Guillaume.Abadie
Oops... fixes build
Change 3950298 by Rolando.Caloca
DR - vk - Gather wait semaphores in the cmd buffers
Change 3950371 by Rolando.Caloca
DR - vk - fixes for async compute
Change 3950597 by Rolando.Caloca
DR - vk - Fix for clip distance (fixes planar reflections)
Change 3951075 by Rolando.Caloca
DR - vk - Fix for async compute
Change 3952524 by Guillaume.Abadie
Some DOF enum refactoring.
Change 3955016 by Daniel.Wright
Fixed BuiltData package getting renamed into the map package during a content browser folder move, causing a redirector to be incorrectly placed in the map package
Change 3955668 by Guillaume.Abadie
Fixes a bug where full res coc buffer was computed even if not doing slight out of focus.
Change 3956722 by Guillaume.Abadie
Fixes a bug where r.DOF.MaximalForegroundBlurringRadius was screen percentage dependent.
Change 3959212 by Guillaume.Abadie
Prefixes all DOF's shaders files with DOF keyword.
Change 3959705 by Guillaume.Abadie
Optimises the DOF setup pass outputting half res and full res with LDS downsample.
Change 3959941 by Guillaume.Abadie
Halves DOF's hybrid scatter compilation by using a unique downsampling for both foreground and background, instead of 2 reduce passes.
Change 3962273 by Rolando.Caloca
DR - Fix typos
#jira UE-56317
PR #4586
Change 3962615 by Rolando.Caloca
DR - vk - Compile fix
Change 3962949 by Rolando.Caloca
DR - Fix DOFDownsample extension
Change 3962993 by Guillaume.Abadie
Back out changelist 3962949
Change 3963016 by Guillaume.Abadie
Adds missing DOFDownsample.usf
Change 3963041 by Rolando.Caloca
DR - vk - Misc changes to help integrate
Change 3964293 by Guillaume.Abadie
Fixes DOF's setup pass reading outside of the viewport.
Change 3964475 by Guillaume.Abadie
Collapses DOF's hybrid scatter compilation passes into reduce passes.
Change 3964883 by Daniel.Wright
Fixed 3d texture in uniform buffer on unsupporting RHI
Change 3964897 by Rolando.Caloca
DR - Compile fixes
Change 3964914 by Guillaume.Abadie
Fixes a bug on r.DOF.RecombineQuality=0
Change 3965153 by Guillaume.Abadie
Fixes compile warning in D3D12Commands.cpp.
Change 3965814 by Rolando.Caloca
DR - Prep for integration conflict resolve
Change 3965899 by Rolando.Caloca
DR - Fix odd linkage issue
Change 3966072 by Rolando.Caloca
DR - More prep for merge
Change 3966163 by Rolando.Caloca
DR - Merge prep
Change 3966844 by Guillaume.Abadie
Packs multiple DOF scattered bokeh per instance and uses PT_RectList in DOF for platforms that can.
Change 3967116 by Rolando.Caloca
DR - Compile fixes for integration
Change 3967273 by Rolando.Caloca
DR - Use same path for mip generation
Change 3967277 by Rolando.Caloca
DR - vk - Fix mips on cubemaps
Change 3967693 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, missing shaders
Change 3967851 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, Engine 2/2
Change 3968083 by Rolando.Caloca
DR - Integration compile fixes
Change 3968240 by Rolando.Caloca
DR - Shader compile fixes for integration
Change 3968270 by Rolando.Caloca
DR - Fix for missing hash calculation
Change 3969426 by Rolando.Caloca
DR - vk - Fix warning
Change 3969869 by Krzysztof.Narkowicz
Back out changelist 3946295 - UE-54537 is fixed, so no need for this debug sentinel.
#jira none
Change 3969944 by Rolando.Caloca
DR - Warning fix
Change 3970020 by Rolando.Caloca
DR - Bump after integration
Change 3970052 by Rolando.Caloca
DR - Fix for mobile
Change 3970236 by Daniel.Wright
Causing decal shader to recompile to fix a merge bug
Change 3970270 by Daniel.Wright
Bump shader version from merge
Change 3970339 by Olaf.Piesche
Replace series of locks/unlocks with a single one for curve injection
#tests QAGame
Change 3970390 by Rolando.Caloca
DR - Rename FSceneTextureUniformParameters to FSceneTexturesUniformParameters
- Remove duplicate method for occlusion queries
Change 3970523 by Rolando.Caloca
DR - Fix serialization of shaders
Change 3970533 by Arne.Schober
DR - Fix for removing the SpeedTree wind when the scene gets deleted. The original enqueued render command requeues the element onto the render thread although the call already came from the render thread, and the scene can get lost in between.
#jira UE-56322
Change 3971160 by Guillaume.Abadie
Fixes CompositeEditorPrimtive pass and SelectionOutline pass for VR editor to work with TAAU.
Change 3971516 by Guillaume.Abadie
Cherry-pick 3912629: Fixes SSR that was computing vignetting according to PrevScreen, which could let some samples from outside the viewport through when rotating the camera.
#jira UE-55353
Change 3971594 by Krzysztof.Narkowicz
Fixed assert inside BindLightMapVertexBuffer. FSplineMeshSceneProxy was calling BindLightMapVertexBuffer for invalid (still not generated) lightmap UV channel after mesh reimport. Simplified assert, as at the moment almost all of the high callsites already clamp lightmap uv channel.
#jira UE-56321
Change 3971622 by Krzysztof.Narkowicz
Fixed crash inside Indirect Lighting Cache. Data (reflection captures and lightmap) generation calls ULevel::GetOrCreateMapBuildData(), which can destroy lightmap data if level has legacy data. Last Lightmap generation step recreates this data, but if user cancels lightmap generation - it won't do that.
#jira UE-56171
Change 3974788 by Rolando.Caloca
DR - Remove GSupportsGenerateMips
Change 3974789 by Rolando.Caloca
DR - Remove bogus function
Change 3974986 by Rolando.Caloca
DR - vk - Tracking fixes
Change 3974989 by Rolando.Caloca
DR - vk - Don't submit dummy barriers
Change 3975075 by Olaf.Piesche
Update for particle curve injection improvement, fixing ES2 problems
#tests QAGame tm-shadermodels, various color curve tests in-editor
Change 3975957 by Uriel.Doyon
Fixed invalid max texture resolution when using the bake material tools.
Change 3978471 by Daniel.Wright
New cvar r.SkylightUpdateEveryFrame
Change 3978779 by Rolando.Caloca
DR - Accessor for texture sizes
Change 3978797 by Rolando.Caloca
DR - Clean up RHI CopyTexture API
Change 3978832 by Rolando.Caloca
DR - vk - Workaround for RenderDoc crashing due to Descriptor Pool reset
Change 3978836 by Rolando.Caloca
DR - vk - Remove generate mips
Change 3979201 by Rolando.Caloca
DR - vk - RHI CopyTexture. Uses general layout for generating mips
Change 3979204 by Rolando.Caloca
DR - Use render passes and CopyTexture to generate mips
Change 3979592 by Rolando.Caloca
DR - Warning fix
Change 3980855 by Krzysztof.Narkowicz
Optimize bounding sphere radius after non-uniform scale by using bounding box extent.
#jira UE-56227
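One way to read this optimization, sketched under the assumption of an axis-aligned scale and a box and sphere that share a center: scaling the sphere by the largest scale factor is always safe, but the sphere that encloses the scaled box can be much smaller. The types and function below are hypothetical and are not the engine's implementation:

```cpp
#include <algorithm>
#include <cmath>

struct FVec3 { float X, Y, Z; };

// Hypothetical helper: bounding sphere radius after a per-axis scale.
float ScaledSphereRadius(FVec3 BoxExtent, float SphereRadius, FVec3 Scale)
{
    const float AbsX = std::fabs(Scale.X);
    const float AbsY = std::fabs(Scale.Y);
    const float AbsZ = std::fabs(Scale.Z);

    // Conservative bound: grow the sphere by the largest scale factor.
    const float ConservativeRadius = SphereRadius * std::max({AbsX, AbsY, AbsZ});

    // Tighter bound: radius of the sphere enclosing the scaled box.
    const float EX = BoxExtent.X * AbsX;
    const float EY = BoxExtent.Y * AbsY;
    const float EZ = BoxExtent.Z * AbsZ;
    const float BoxRadius = std::sqrt(EX * EX + EY * EY + EZ * EZ);

    return std::min(ConservativeRadius, BoxRadius);
}
```

For a flat box with extent (3, 4, 0) and radius 5, stretching Z by 10 leaves the box unchanged, so the result stays 5 instead of growing to 50.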
Change 3981065 by Rolando.Caloca
DR - vk - Fix bad layout
#jira UE-56238
Change 3981346 by Rolando.Caloca
DR - Copy from 3707257
Support for not flushing compute jobs (r.D3D11.UAVFlushNV)
Change 3981347 by Rolando.Caloca
DR - Copy from 3707257
Don't flush between morph dispatches
Change 3981932 by Mark.Satterthwaite
Generate the shader hash and function name when a Metal shader error needs to be reported so that even without shader code we get something to go on.
Change 3982442 by Rolando.Caloca
DR - Fix warning
Change 3982652 by Rolando.Caloca
DR - vk - Signal semaphore cleanup
Change 3983917 by Richard.Wallis
Clone of CL 3974146 converted for mtlpp along with extra mtlpp usage suggestions by Mark Satt:
Fix for black flickering on first paint with weighted material landscape on Mac. When using AsyncCopyFromBufferToTexture in Metal we put the blit operation on the prologue encoder - however after a draw call using that resource the copy operation should happen after on the current encoder, this keeps the correct order of operations.
Added bool return from various Async renderpass resource requests so the caller can decide the correct further action. Updated to include the other async functions.
Change 3984409 by Guillaume.Abadie
Attempts to make static analysis happy again.
Change 3984435 by Nick.Bullard
Checking in Performance Test level provided to us by Tor Frick based on UE-44841.
This has been utilized for checking issues against Aftermath performance impact.
The Map includes 2 Level Book marks, most testing has been done against Bookmark 1 view, in fullscreen, in game mode
Change 3985087 by Mark.Satterthwaite
Make sure that the particle scratch buffer is large enough to hold all the data for the curve texture we are rendering to, otherwise a full set of curves will start scribbling memory after 64Kb (the curve texture is 256Kb of data - 512x512x4 as sizeof(RGBAUInt8) == 4). This happens in ElementalDemo.
Change 3985201 by Rolando.Caloca
DR - Fix bad CopyTexture
Change 3985258 by Mark.Satterthwaite
Try and detect orientation changes so that we don't blow-up on iOS due to a huge mismatch between the drawable texture for the display and the scene's depth-stencil target. I can't just fiddle with the depth-stencil texture itself without running the risk of obliterating in-use data and really we shouldn't permit such a mismatch anyway but it is fallout from 3620990.
#jira UE-55756
Change 3986449 by Rolando.Caloca
DR - vk - Update & consolidate Vulkan headers to 1.1.70.1
Consolidate SDK into one
Change 3986571 by Guillaume.Abadie
Makes PVS-Studio happy again in DOF.
Change 3987039 by Yuriy.ODonnell
Initial implementation of tracing profiler to show CPU and multiple GPUs on the same timeline. Currently only supported on DX12 platforms.
Use `TracingProfiler frames=N` console command to trigger a capture of the next N frames. Trace is saved to disk as a JSON file into `Saved/Profiling/Traces` directory.
Trace file uses Google Tracing format and can be visualized in Chrome built-in profiler (chrome://tracing).
`r.GPUStatsChildTimesIncluded=1` CVar makes timing scopes hierarchical.
`TracingProfiler.BufferSize=N` CVar controls the size of the tracing buffer, which may need to be increased for long traces (default is 65k events). Can only be set at startup.
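For orientation, the sketch below formats one "complete" event (`"ph":"X"`) the way chrome://tracing expects it, with timestamps in microseconds. The struct and function are hypothetical; the exact fields the engine writes are not listed in this description:

```cpp
#include <cstdint>
#include <string>

// Hypothetical record for one timed scope.
struct FTraceSpan
{
    std::string   Name;       // assumed to need no JSON escaping
    std::uint64_t StartUs;    // start time in microseconds
    std::uint64_t DurationUs; // duration in microseconds
    int           ProcessId;
    int           ThreadId;
};

// Formats one span as a trace event object.
std::string ToTraceEventJson(const FTraceSpan& Span)
{
    return std::string("{\"name\":\"") + Span.Name
        + "\",\"ph\":\"X\""
        + ",\"ts\":" + std::to_string(Span.StartUs)
        + ",\"dur\":" + std::to_string(Span.DurationUs)
        + ",\"pid\":" + std::to_string(Span.ProcessId)
        + ",\"tid\":" + std::to_string(Span.ThreadId)
        + "}";
}
```

Events of this shape are collected in a JSON array, or under a `traceEvents` key, in the file that the viewer loads.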
Change 3987074 by Yuriy.ODonnell
Implemented timestamp calibration on DX11. Calibration is only performed when tracing profiler session starts.
Change 3987160 by Yuriy.ODonnell
Added thread naming and ordering to the tracing profiler output
Change 3987331 by Mark.Satterthwaite
Remove the Nvidia hack to retain resource references in command-buffers for UE-46604 as the mtlpp refactor provides stronger resource lifetime guarantees.
#jira UE-46604
Change 3987754 by Mark.Satterthwaite
Fix MetalRHI memory reporting in non-default path.
PR #4568
Change 3988184 by Arciel.Rekman
Linux: Fix editor OpenGL performance (UE-55960).
- GetCurrentThreadId() calls became much more frequent with the OpenGL RHIT refactor.
- We used to only cache that value in monolithic builds, because having per-thread static variables in dynamic libraries is risky due to OS limits.
- This change adds dynamically-managed per-thread cache for non-monolithic builds.
#jira UE-55960
Change 3988394 by Rolando.Caloca
DR - vk - Improve memory mgmt
- Use 256MB pages for Device heap (or 1/8th if less).
- Remove texture allocations not going through resource manager
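A sketch of the page-size rule above, assuming "or 1/8th if less" means one eighth of the heap size whenever that is smaller than 256MB. The function and constant names are hypothetical:

```cpp
#include <algorithm>
#include <cstdint>

// Upper limit on a device heap page: 256 MB.
constexpr std::uint64_t GMaxPageSize = 256ull * 1024ull * 1024ull;

// Hypothetical helper: page size for a device heap of the given size.
std::uint64_t ChooseDevicePageSize(std::uint64_t HeapSizeBytes)
{
    return std::min(GMaxPageSize, HeapSizeBytes / 8);
}
```

Under this reading an 8 GB heap uses 256 MB pages, while a 1 GB heap uses 128 MB pages.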
Change 3988405 by Marcin.Undak
Fix VulkanQuery crash on exit #codereview rolando.caloca #codereview arciel.rekman #rb arciel.rekman
Change 3988567 by Rolando.Caloca
DR - vk - Support for packed global UBs on pci aperture heap
Change 3988668 by Rolando.Caloca
DR - vk - Remove old comments
Change 3988956 by Marcin.Undak
RecordPerformance: added option to skip building/cooking before tests #rb none #codereview arciel.rekman
Change 3989161 by Yuriy.ODonnell
Static analysis error fix
Change 3989196 by Guillaume.Abadie
Fixes a crash in light shaft's TAA pass.
#jira UE-57366
Change 3989207 by Yuriy.ODonnell
Refactored FRealtimeGPUProfilerFrame to avoid splitting profile events when calculating exclusive times of scopes. This allows tracing profiler to retain the hierarchical view of the data, while keeping CSV and GPU Stat system behavior intact.
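A minimal sketch of deriving exclusive times while leaving the hierarchy intact, assuming each scope stores its parent index and an inclusive duration, and that children are fully nested inside their parent. The types are hypothetical and are not FRealtimeGPUProfilerFrame's:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scope record: Parent is -1 for a root scope.
struct FScope
{
    int           Parent;
    std::uint64_t InclusiveUs;
};

// Exclusive time = inclusive time minus the inclusive time of direct children.
std::vector<std::uint64_t> ComputeExclusiveTimes(const std::vector<FScope>& Scopes)
{
    std::vector<std::uint64_t> Exclusive(Scopes.size());
    for (std::size_t Index = 0; Index < Scopes.size(); ++Index)
    {
        Exclusive[Index] = Scopes[Index].InclusiveUs;
    }
    for (std::size_t Index = 0; Index < Scopes.size(); ++Index)
    {
        const int Parent = Scopes[Index].Parent;
        if (Parent >= 0)
        {
            Exclusive[static_cast<std::size_t>(Parent)] -= Scopes[Index].InclusiveUs;
        }
    }
    return Exclusive;
}
```

Because the input events are never split, the same list can feed both a hierarchical trace view and flat per-scope statistics.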
Change 3989469 by Rolando.Caloca
DR - vk - Fix for bad index; fix for bad transition
Change 3989772 by Yuriy.ODonnell
Implemented timestamp calibration on Vulkan
Change 3990040 by Marcus.Wassmer
Aftermath enabled by default.
Removed unnecessary warning for other vendors
Change 3990064 by Mark.Satterthwaite
Ensure that packed globals are reuploaded when the command-encoder is restarted - don't simply invalidate the existing parameters. This properly handles cases where a single logical render-pass is broken into multiple command-encoders and/or command-buffers - otherwise all shaders must reset all parameters each time. When we move between frames we *do* want to perform a full state reset though as previous frame globals are treated as invalid.
Change 3990080 by Mark.Satterthwaite
Change the way we invalidate the visibility buffer between command-buffers and command-encoders so that on iOS you can reuse the same buffer within the same command-buffer, but not across more than one. The code provides an exception to this rule when running under the MetalRHI validation tools which can break each draw call into its own buffer.
Change 3990084 by Mark.Satterthwaite
Get MetalStatistics compiling again.
Change 3990381 by Arciel.Rekman
Bring back D3D12 in RecordPerformance.
Change 3991113 by Rolando.Caloca
DR - Fix crash on RHI thread on mobile preview
- Check RHI objects are not null in the PSO initializer
Change 3991191 by Ryan.Vance
#jira UE-55952
Reimplemented instanced stereo for forward lighting cull grid after the srv/ub clean up.
Change 3991343 by Rolando.Caloca
DR - Copy from 3911492
UE4 - Disabled parallel mobile base pass by default. This is experimental and not known to be useful on any mobile platform.
Change 3991375 by Mark.Satterthwaite
Proper copyright assignment in the mtlpp debugger header.
Change 3993151 by Daniel.Wright
Fix RTDF resource transition found by Rolando
Change 3993818 by Rolando.Caloca
DR - Missed file
Change 3993923 by Krzysztof.Narkowicz
Fixed crashes inside RemoveSpeedTreeWind() and RemoveSpeedTreeWind_RenderThread().
FStaticMeshComponentRecreateRenderStateContext didn't flush deferred render updates causing stale RenderData to be left:
1. Thumbnail manager called SetStaticMesh(nullptr), which added StaticMeshComponent to deferred render updates.
2. UStaticMesh::Build called FStaticMeshComponentRecreateRenderStateContext and destroyed RenderData, but didn't touch the Thumbnail manager's StaticMeshComponent as it was nullptr.
3. This resulted in a StaticMeshComponent with stale RenderData pointer.
#jira UE-54544
Change 3994033 by Rolando.Caloca
DR - vk - Reworked layers & extensions, as we were not doing it properly
- Remove -vulkanstandardvalidation and -novulkanstandardvalidation as they are not needed anymore
Change 3994275 by Mark.Satterthwaite
Change to linking against mtlpp via AddEngineThirdPartyPrivateStaticDependencies and marking its header with THIRD_PARTY_* macros in the vain hope that might convince the remote compilation code to distribute the module to the remote machine when building MetalRHI.
#jira UE-57507
Change 3994365 by Mark.Satterthwaite
Pilfer some code from the old MetalHeap file to handle calculating texture memory size on older macOS and iOS builds when running with stats or LLM enabled.
#jira UE-57513
Change 3994382 by Rolando.Caloca
DR - vk - Some missing locks during image tracking
Change 3994422 by Rolando.Caloca
DR - vk - Remove bogus shader format
Change 3995530 by Rolando.Caloca
DR - vk - Fix for crash when validation is enabled
Change 3995531 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3995532 by Rolando.Caloca
DR - vk - Added support for r.Vulkan.SaveValidationCache
Change 3995610 by Uriel.Doyon
Texture Streaming Changes and Fixes:
- Using small FOV items (like scopes) now only affects visible primitives (through "r.Streaming.MaxHiddenPrimitiveViewBoost").
- Static components added after the level is registered in the streaming manager are now handled correctly (fixes the low quality on the chests)
- Dynamic components do not need to register to the streaming manager anymore.
- Optimized dynamic component management by removing duplicate entries in the update list.
- Added a pregarbage collect pass to the dynamic component management to optimize GC handling.
- Added a budget reset logic whenever the scene requirements change significantly.
- PIE worlds now have correct visibility information.
- Fixed possible invalid memory access when processing the streaming manager slave views.
- Refactored the incremental level texture data build to prevent new components from being unhandled.
- Removed StreamingManager callbacks for NotifyActorSpawned() and NotifyPrimitiveAttached()
- Added a StreamingManager callback NotifyPrimitiveUpdated(), to be used whenever a primitive streaming state must be updated.
#jira none
Change 3995908 by Arciel.Rekman
Fix compile errors when using new Vulkan queries.
Change 3995990 by Arciel.Rekman
More compile fixes to new Vulkan queries.
- MSVC did not catch this, clang did.
Change 3996101 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3996323 by Mark.Satterthwaite
Use the right include path to export the mtlpp headers.
#jira UE-57507
Change 3996392 by Arciel.Rekman
Vulkan: fix crash on start when using new queries.
- CommandBufferManager was not yet set at that point and the code in queries relied on it.
Change 3996585 by Rolando.Caloca
DR - Slight improvement to GL being black, but just a temporary 'workaround' as it's not correct.
Change 3998806 by Arciel.Rekman
Fix Linux build (UE-57602).
#jira UE-57602
Change 3998866 by Arciel.Rekman
SubwaySequencer: fix old shader platform name.
Change 3998947 by Mark.Satterthwaite
Silence deprecation warnings in CEF on macOS now that we've moved to 10.12 as the minimum.
#jira UE-57577
Change 3998951 by Mark.Satterthwaite
Fix last of the deprecation errors that I am aware of for macOS 10.12.
#jira UE-57581
Change 3998984 by Mark.Satterthwaite
Build mtlpp for iOS 9.0 not 9.3.
#jira UE-57586
Change 3999065 by Rolando.Caloca
DR - vk - Make sure we use version 1.0.0
#jira UE-57521
Change 3999071 by Arne.Schober
DR - [UE-55433, UE-57361] Hack SNORM support in OpenGL by re-interpreting UNORM. Underlying data is always SNORM.
#jira UE-55433, UE-57361
Change 3999494 by Rolando.Caloca
DR - Enable r.UnbindResourcesBetweenDrawsInDX11 in debug
- Clear compute resources when r.UnbindResourcesBetweenDrawsInDX11 is enabled
Change 4000197 by Krzysztof.Narkowicz
Mesh simplifier - normalize TexCoordWeights using min/max TexCoord range. This fixes precision issues for very big TexCoord values and makes it possible to optimize for all TexCoord channels when channels have values of different magnitudes (e.g. non-standard TexCoord data).
#jira UE-54935
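Note (illustrative only): a minimal sketch of range-based weight normalization, assuming the weight of a channel is divided by the extent (max minus min) of that channel's values. `FUV` and `NormalizedTexCoordWeight` are hypothetical; the simplifier's actual formula is not given in this description.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 2D texture coordinate.
struct FUV
{
    float U;
    float V;
};

// Scales BaseWeight by the inverse of the channel's value range so that
// channels with very large coordinates do not dominate the error metric.
// Falls back to BaseWeight when the range is empty or degenerate.
inline float NormalizedTexCoordWeight(const std::vector<FUV>& UVs, float BaseWeight)
{
    if (UVs.empty())
    {
        return BaseWeight;
    }
    float MinValue = UVs[0].U;
    float MaxValue = UVs[0].U;
    for (const FUV& UV : UVs)
    {
        MinValue = std::min({MinValue, UV.U, UV.V});
        MaxValue = std::max({MaxValue, UV.U, UV.V});
    }
    const float Range = MaxValue - MinValue;
    return Range > 0.0f ? BaseWeight / Range : BaseWeight;
}
```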
Change 4000305 by Yuriy.ODonnell
Suppress PVS Studio warning V547 (Expression is always true) related to Aftermath
Reported issue to PVS team and to NVIDIA. Confirmed false positive, fix coming in future PVS version (v6.24).
#jira UE-57579
Change 4000853 by Arciel.Rekman
Linux: fix not calling CrashReportClient (UE-57678).
#jira UE-57678
Change 4001504 by Rolando.Caloca
DR - vk - Fix transition
Change 4002460 by Krzysztof.Narkowicz
Toggle for contact shadow length in world space
Exposed contact shadows to Blueprints
#jira none
Change 4002608 by Rolando.Caloca
DR - vk - Fix static analysis
- Fix potential debug image tracking crash
- Comment out unused methods
Change 4002615 by Rolando.Caloca
DR - vk - Allow r.Vulkan.WaitForIdleOnSubmit to be set at startup (e.g. in ConsoleVariables.ini)
Previously, if your map needed to UpdateSkyCaptureContents on startup, an ensure would fail if GWaitForIdleOnSubmit was set.
PrepareForCPURead needs to wait for the command buffer to finish before trying to read the results back, but the wait has already happened when r.Vulkan.WaitForIdleOnSubmit is set. Trying to wait again correctly complains that the command buffer is not in the correct state. So, skip the WaitForCmdBuffer call when r.Vulkan.WaitForIdleOnSubmit is set.
Change 4002640 by Rolando.Caloca
DR - vk - Missing support for CVarDefaultBackBufferPixelFormat
Change 4002919 by Guillaume.Abadie
Implements DOF's temporal upsampling pass for better dynamic resolution stability.
Change 4002984 by Guillaume.Abadie
Integrates Sebastian Aaltonen's ALU optimisations for TAAU.
Change 4003112 by Olaf.Piesche
Fix for TBB stall (resulting in severe hitches and hangs in the editor with stats active); tested multiple scenarios and encountered no hitches.
#tests QAGame PerformanceTest and RenderTest map with various stats on and off
Change 4003159 by Mark.Satterthwaite
Undo parts of changelist 3970553 - the ref-counted pointer approach to returning textures to the pool is not working as expected so we'll remove that. It'll be faster on the CPU without it and everything works thanks to the changes this CL made to the way textures were released.
#jira UE-57538
Change 4003287 by zachary.wilson
Adding reflection capture content to TM-LightingScenarios
Change 4003395 by Arne.Schober
DR - Fix uninitialised value when clicking Go To in the editor
#jira UE-57048
Change 4003425 by Rolando.Caloca
DR - vk - Fix for new occlusion queries
Change 4003530 by Arne.Schober
DR - Disable GPU Benchmark in headless configurations
#jira UE-57673
Change 4003717 by Rolando.Caloca
DR - vk - Fix for depth not store, stencil store
Change 4003719 by Rolando.Caloca
DR - Minor switch to render pass
Change 4003720 by Mark.Satterthwaite
Don't suballocate private memory buffers on Vega and only Vega as there is something wrong with the blits in those cases but I can't capture a GPU trace to find out what right now (the driver is broken) - could be a bug in my code but this works on Polaris and Nvidia so it will need to be filed as a radar for AMD.
Remove the FMetalBufferChunk from FMetalBuffer and simply store a pointer to the owning Heap/Magazine allocator. The FMetalResourceHeap now calls a new Release function to return the buffer to the allocator which will be faster on the CPU.
#jira UE-57659
Change 4003854 by Mark.Satterthwaite
Undo parts of 3990064 and try a different approach to get the uniforms to upload and remain available in the right places. As the original bug has been lost to time we should keep an eye out for missing buffer bindings by running under the Metal validation layer periodically.
#jira UE-57576
Change 4004709 by Rolando.Caloca
DR - Support for D3D 11, 12 & Vulkan for UAVs off Index Buffers
Change 4005149 by Guillaume.Abadie
Adds shader permutation to avoid clamping input buffer UV in DOF's gather pass.
Change 4005284 by Uriel.Doyon
Resaved volume texture assets with proper engine version.
#jira UE-57534
Change 4005286 by Guillaume.Abadie
Reduces constant setup in DOF's gather pass.
Change 4005359 by Rolando.Caloca
DR - vk - Fix annoying warning
Change 4005363 by Rolando.Caloca
DR - Fix android not finding vulkan shaders
Change 4005457 by Rolando.Caloca
DR - vk - Fix swapchain crash
Change 4005473 by Patrick.Kelly
UE-57135: Editor crash if set Reflection Capture Resolution to be 64 and New a Default level
Code by Daniel
Tested by Patrick
Change 4005474 by Rolando.Caloca
DR - vk - Remove glsl code from shaders. Packaged QAGame goes from 176MB to 162MB
Change 4005759 by Krzysztof.Narkowicz
Fixed a bug, where reflection capture build is called, even though we are in mobile preview mode.
#jira UE-57743
Change 4005774 by Mark.Satterthwaite
Update the wave intrinsics to avoid implicit bool->uint conversion that Apple don't like.
#jira UE-57750
Change 4005974 by Mark.Satterthwaite
Don't use cubemap array types on iOS Metal as they aren't available on all devices and we need to maintain backward compatibility for years to come.
#jira UE-57083
Change 4006056 by Mark.Satterthwaite
Remove the use of the PrimitiveType argument from Metal draw calls.
#jira UE-57822
Change 4006139 by Mark.Satterthwaite
- Move the render-pass functions into the MetalRHI implementation for later alteration.
- Implement Index buffer UAVs for Metal - makes them more like vertex-buffers so this is one more step on the road to a unified buffer base-class implementation.
Change 4006215 by Mark.Satterthwaite
Metal's begin & end render/compute pass API implementation will take some time, but for now make it not depend on the parent stub implementation.
Change 4006394 by Mark.Satterthwaite
In lieu of a real instruction count just use the number of lines in the "Main" function of the shader as the instruction count for Metal.
#jira UE-57551
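Note (illustrative only): a naive sketch of the "lines in Main" estimate. `CountFunctionBodyLines` is a hypothetical helper; it assumes the first occurrence of the name is the function definition, counts only non-empty lines, and ignores braces inside comments or strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the non-empty lines between the braces of the first function whose
// signature contains FunctionName. Returns 0 if no such body is found.
inline int CountFunctionBodyLines(const std::string& Source, const std::string& FunctionName)
{
    const std::size_t NamePos = Source.find(FunctionName);
    if (NamePos == std::string::npos)
    {
        return 0;
    }
    const std::size_t Open = Source.find('{', NamePos);
    if (Open == std::string::npos)
    {
        return 0;
    }

    int Depth = 0;
    int Lines = 0;
    bool bLineHasCode = false;
    for (std::size_t i = Open; i < Source.size(); ++i)
    {
        const char C = Source[i];
        if (C == '{')
        {
            ++Depth;
        }
        else if (C == '}')
        {
            if (--Depth == 0)
            {
                // Count code that shares a line with the closing brace.
                if (bLineHasCode)
                {
                    ++Lines;
                }
                break;
            }
        }
        else if (C == '\n')
        {
            if (bLineHasCode)
            {
                ++Lines;
            }
            bLineHasCode = false;
        }
        else if (C != ' ' && C != '\t' && C != '\r')
        {
            bLineHasCode = true;
        }
    }
    return Lines;
}
```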
Change 4006493 by Mark.Satterthwaite
MetalRHI can currently support 4-component formats for Buffer UAVs - this might need some thought in the future as the API evolves but we might as well take advantage while we can.
Change 4006495 by Daniel.Wright
Integrate from Refactor branch
* New FMaterialRenderProxy function GetMaterialWithFallback which provides both the FMaterialRenderProxy and FMaterial. Needed when falling back to default material, so that proxy and material resource match.
* Local vertex factory uniform buffer
Change 4006851 by Brian.Karis
Fix for joined charts forming an L to inflate both axes.
Thanks to Jess Kube of The Coalition.
Change 4006852 by Brian.Karis
Fix for hard coded reflection capture cube map size. Should fix static light aliasing in captures.
Change 4006918 by Brian.Karis
New ByteBuffer functionality. Memcpy and scatter upload. Can implement GPU side TArray reflection.
Not yet used by checked in code. WIP optimization.
Change 4007246 by Guillaume.Abadie
Creates lower quality permutation for DOF's gathering pass, without CoC-based weighting of the samples, and a lower number of gathering rings for the fast accumulator.
Change 4007291 by Guillaume.Abadie
Exposes more DOF scalability settings.
Change 4007328 by Guillaume.Abadie
Optimises DOF's half res only setup pass using gather4
Change 4007627 by Richard.Wallis
Fix for when Magic Mouse cannot zoom in World Composition editor. Missing default SNodePanel::OnMouseMove behaviour. Tested using a classic 2xbutton + wheel mouse and a Mac MagicMouse.
#jira UE-57030
Change 4007682 by Richard.Wallis
No video when playing HLS streaming video on Mac. Two issues: FPS was zero, making the duration for the video sample buffer nonsense, and Video Track dimensions were going to zero on the AVAsset once fully initialized when playing HLS streams. Now cache relevant details and handle zero frame rate.
Notes:
- Caching the frame rate is not as important as we could look it up each time and fix for zero - ignoring that at the moment.
- Assume we DO NOT want the FrameSize to be the last fetched video frame size from the AvfMediaVideoSampler as I think that is the video quality for streaming video and not the media frame size.
- Renamed a variable in the AvfMediaVideoSample - was called FrameRate but it was the FrameDuration by that point.
#jira UE-56734
Change 4007731 by Rolando.Caloca
DR - Disable byte buffers on non-hlsl based platforms
#jira UE-57851
Change 4007741 by Rolando.Caloca
DR - Disable byte buffers on hlslcc platforms
Change 4007782 by Mark.Satterthwaite
Force Metal shaders, including the stdlib, to recompile.
Change 4007918 by Rolando.Caloca
DR - vk - Some static asserts
Change 4008404 by Arciel.Rekman
Do not crash on incompatible Vulkan drivers (UE-57521).
#jira UE-57521
Change 4008442 by Daniel.Wright
Better comments on ERHIFeatureLevel expectations
Change 4008494 by Arne.Schober
DR - moved bDeletedThroughDeferredCleanup before begincleanup to catch cases where the reference is added twice to the array. Also removed finishcleanup as all they ever did was delete the pointer anyway, and it should be added if such functionality is ever required from outside of the regular destructor.
#jira UE-57754
Change 4008730 by Mark.Satterthwaite
After the most recent changes to handling uniform buffer dirty bits in MetalRHI we should guard against attempts to set an unbound uniform buffer.
#jira UE-57870
Change 4008949 by Brian.Karis
Fix compile warning
Change 4008951 by Brian.Karis
Added LTC LUT textures
Change 4009326 by Guillaume.Abadie
Compiles out DOF's gathering bokeh simulation on platforms other than desktop.
Change 4009380 by Krzysztof.Narkowicz
Moved area light code before the contact shadows, so contact shadows use representative light's direction.
Merged all contact shadows shader code.
Contact shadows keep constant screen space length independent of FoV settings.
Contact shadows for translucents.
Contact shadows for eye.
Change 4009555 by Guillaume.Abadie
Splits DOFCocTile.usf in two.
Change 4009999 by Yuriy.ODonnell
MallocStomp can now be enabled on certain platforms using '-stompmalloc' command line argument.
Previously it was necessary to modify MallocStomp.h and re-compile the engine.
Currently supported platforms: Win64, Mac, Linux.
Replaced hard-coded page size with FPlatformMemory::GetConstants().PageSize.
Change 4010288 by Rolando.Caloca
DR - vk - Fix for vertex streams
Change 4010289 by Krzysztof.Narkowicz
D3D12 - fixed depth bounds bug, where depth bounds wasn't properly set to [0;1] after disabling.
#jira UE-57510
Change 4010297 by Rolando.Caloca
DR - vk - Remove some functions for android
Change 4010315 by Rolando.Caloca
DR - vk - Remove create info macro
Change 4010451 by Rolando.Caloca
DR - vk - Reuse samplers
- Infiltrator goes from 5759 to 24 samplers!
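Note (illustrative only): sampler reuse of this kind is commonly done by keying a cache on the sampler description. The sketch below assumes that approach; `FSamplerDesc` and `FSamplerCache` are hypothetical, and the integer handle stands in for a real API sampler object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical, heavily reduced sampler description.
struct FSamplerDesc
{
    int Filter;
    int AddressU;
    int AddressV;
    int MaxAnisotropy;
};

// Returns the same handle for identical descriptions instead of creating a
// new sampler object every time one is requested.
class FSamplerCache
{
public:
    std::uint32_t FindOrCreate(const FSamplerDesc& Desc)
    {
        const FKey Key = std::make_tuple(Desc.Filter, Desc.AddressU, Desc.AddressV, Desc.MaxAnisotropy);
        const auto It = Cache.find(Key);
        if (It != Cache.end())
        {
            return It->second;
        }
        // Stand-in for the real API call that creates a sampler object.
        const std::uint32_t Handle = NextHandle++;
        Cache.emplace(Key, Handle);
        return Handle;
    }

    std::size_t NumCreated() const { return Cache.size(); }

private:
    using FKey = std::tuple<int, int, int, int>;
    std::map<FKey, std::uint32_t> Cache;
    std::uint32_t NextHandle = 1;
};
```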
Change 4010627 by Rolando.Caloca
DR - vk - Fix missing values for tracking swapchain validation
Change 4011924 by Guillaume.Abadie
Implements tile based early return optimisation on DOF's postfiltering method.
Change 4011941 by Guillaume.Abadie
Shaves some ALU in DOF's accumulator for LowQuality permutation.
Change 4012093 by Yuriy.ODonnell
Disable MallocStompOverrunTest() in static analysis config, as it intentionally performs an out-of-bounds access.
Change 4012195 by Rolando.Caloca
DR - vk - Fix for mobile backbuffer layout
Change 4012202 by Rolando.Caloca
DR - vk - Don't use staging buffers on UMA
Change 4012467 by Rolando.Caloca
DR - Remove redundant check
Change 4012486 by Rolando.Caloca
DR - Fix missing transition
Change 4012518 by Guillaume.Abadie
Implements fast shader permutation for DOF's TAA pass.
Change 4013084 by Arciel.Rekman
Fix for Linux clock discrepancy.
- Causing at least one precision issue, possibly more.
(Edigrating 4003273, 4012462 from //UE4/Dev-Editor/... to //UE4/Dev-Rendering/...)
Change 4013266 by Uriel.Doyon
Fixed crash when setting SceneDepthTextureNonMS and not having valid depth buffers in the SceneContext.
Change 4013626 by Uriel.Doyon
Fixed crash in the lighting build when creating a blueprint of the ALight and placing a light component in it.
#jira UE-51672
Change 4013805 by Rolando.Caloca
DR - Fix more missing transitions
Change 4014128 by Arne.Schober
DR - Do not create LocalVFUniformBuffer when running without MVF
#jira UE-57929
Change 4014193 by Uriel.Doyon
Editing component transforms now invalidates the component's lighting cache.
#jira UE-48134
Change 4014282 by Rolando.Caloca
DR - vk - Remove extra validation during dump
Change 4014584 by Uriel.Doyon
Duplicated static meshes now generate a new GUID to prevent possible issues with lightmass.
#jira UE-49064
Change 4014604 by Uriel.Doyon
UStaticMesh postduplicate now only generates a new GUID if !bDuplicateForPIE.
Change 4015460 by Guillaume.Abadie
Composes separate translucency within DOF's recombine pass.
Change 4015571 by Guillaume.Abadie
Refactors tonemapper to use global shader permutation API, which adds a permutation for HDR output device rather than dynamic branching that some shader compilers do not optimize well.
Change 4015984 by Krzysztof.Narkowicz
Fixed crash inside DFAO resource allocation, when DFAO viewport has zero area.
#jira UE-58000
Change 4016056 by Mark.Satterthwaite
Fix Mac Metal shader compilation of texture cube arrays.
Change 4016062 by Richard.Wallis
Convert things like Space, Delete, F6 etc to unicode so they display correctly on the Mac menu rather than first letter of word. Added the default Mac commands to the GenericCommands so we get a Chord overwrite message and stop things like cmd+ q / w / h from getting bound.
#jira UE-46999
Change 4016109 by Mark.Satterthwaite
One unified Metal buffer implementation - will make further changes a heck of a lot easier.
Change 4016221 by Patrick.Kelly
UE-57617: Ensure changing viewmode to ShaderComplexity while in -game
Change 4016238 by Guillaume.Abadie
Makes clang happy again in Tonemapper.
Change 4016309 by Mark.Satterthwaite
More *_RenderThread implementations for MetalRHI.
Change 4016414 by Mark.Satterthwaite
And MetalRHI version of CreateStructuredBuffer_RenderThread...
Change 4016498 by Mark.Satterthwaite
Don't hold on to the uniform buffers bound to the hull shader when switching to a tessellated draw call as they'll have the wrong buffer layout.
#jira UE-57930
Change 4017394 by Juan.Canada
OpenGL: Fixed shading artifacts due to incorrect UNORM/SNORM conversions in skin/skincache/computetangent shaders.
#jira UE-57691
Change 4017522 by Rolando.Caloca
DR - vk - Remove unused code path (old mip generation detection)
Change 4017539 by Rolando.Caloca
DR - vk - Fix for sky lighting mips showing green on AMD
Change 4017542 by Arciel.Rekman
Moved appCountTrailingZeros to a non-SSE header (fixes ARM64 build).
- Arguably WITH_SLI shouldn't apply to Linux on ARM but the fact that the function wasn't available is bad on its own.
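Note (illustrative only): a trailing-zero count needs no SSE or other platform intrinsics, which is why it can live in a platform-neutral header. `CountTrailingZerosPortable` is a hypothetical stand-in, not the engine's appCountTrailingZeros; returning 32 for a zero input is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Portable fallback: counts the zero bits below the lowest set bit.
// Returns 32 when Value is zero (assumed convention).
inline std::uint32_t CountTrailingZerosPortable(std::uint32_t Value)
{
    if (Value == 0)
    {
        return 32;
    }
    std::uint32_t Count = 0;
    while ((Value & 1u) == 0)
    {
        Value >>= 1;
        ++Count;
    }
    return Count;
}
```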
Change 4017827 by Guillaume.Abadie
Optimises DOF's scattering cost by a third.
Change 4017835 by Rolando.Caloca
DR - Only allow a render pass to generate mips for one color render target
Change 4017889 by Mark.Satterthwaite
Cache all the Metal state objects to avoid hitting the API unnecessarily.
Change 4018251 by Mark.Satterthwaite
Fix broken rendering on Metal that tracked back to the innocuous looking changes in CL #4006495 (no blame attached - these changes are entirely reasonable) and cause various bugs in QAGame's TM-DistanceFields, ElementalDemo and probably more. Doesn't fix broken SpeedTree rendering :(.
MetalRHI was allowing uniform buffers to blow away linear texture buffers when the constant buffer has been elided due to dead-code elimination. This problem can manifest without linear textures if the uniform buffer contains both constant data and a resource-table but the shader doesn't use any of the constant data. That's because Metal doesn't separate constant buffers from any other kind of buffer unlike D3D which separates all the slots out - and Metal doesn't provide enough buffers to emulate the D3D arrangement. So far this has only manifested in the MVF + Linear Texture case but a more robust solution will be necessary long term.
Change 4018514 by Guillaume.Abadie
Implements r.DOF.Scatter.MinCocRadius.
Change 4018553 by Guillaume.Abadie
Implements r.DOF.Scatter.MaxSpriteRatio to control the budget upperbound of DOF's scattering
Change 4020369 by Yuriy.ODonnell
Disable MallocStompOverrunTest in all static analysis configs (using USING_CODE_ANALYSIS macro)
Previously was only disabled for PVS-Studio.
Change 4020620 by Arciel.Rekman
Fix XboxOne CIS (fallout of appCountTrailingZeros move).
Change 4020949 by Guillaume.Abadie
Configures DOF in scalability settings.
Change 4021593 by Rolando.Caloca
DR - vk - Support for Aftermath style api on AMD
Change 4021740 by Rolando.Caloca
DR - vk - Change log output
Change 4022008 by Uriel.Doyon
Fixed renderthread stalls when streaming texture mips on low end systems.
Change 4022135 by Rolando.Caloca
DR - vk - Fix last mip's layout during mip chain creation
Change 4022607 by Jian.Ru
Speculative fix for a bug where an invalid vertex buffer is dereferenced
#jira UE-56229
Change 4022890 by Rolando.Caloca
DR - Fix reference count not getting released
Change 4023540 by Mark.Satterthwaite
Avoid some pointless retain/release calls on Metal Encoders.
Change 4023796 by Marcus.Wassmer
Tell users they are over the maximum size when allocating very large rendertargets.
Change 4025337 by Yuriy.ODonnell
Improved use-after-free detection mechanism and physical memory usage of MallocStomp on Windows.
MallocStomp on Windows will now reserve virtual address space for every allocation and then commit physical pages only to the valid usable part.
Physical pages will be unmapped on Free, but virtual address space will not be released and therefore will never be re-used.
Virtual address space is allocated from the OS in blocks of 1GB and then linearly sub-allocated.
This reduces VA space usage, as VirtualAlloc returns blocks on 64KB granularity even if we just need 4KB. As a small bonus, this also reduces number of syscalls per allocation.
This dramatically increases accuracy of use-after-free detection, but consumes significant amount of memory for the OS page table.
Virtual memory limit for a process on Win10 is 128 TB, which means we can afford to keep virtual memory reserved for a long time.
Running Infiltrator demo consumes ~700MB of virtual address space per second.
Additionally, committing physical pages only for the usable part of the entire virtual block reduces physical memory usage by ~30% compared to old behavior,
which allocated and committed the entire block of pages via BinnedAllocFromOS and then marked the border page as non-accessible.
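Note (illustrative only): the sketch below models just the address arithmetic of the scheme described above, linear sub-allocation out of 1GB reserved blocks with no reuse of freed ranges. It makes no OS calls; `FLinearVirtualRange` and its simulated base addresses are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simulates carving page-aligned ranges out of 1GB virtual address blocks.
// There is deliberately no Free path: released ranges are never handed out
// again, which is what makes stale pointers fault reliably.
class FLinearVirtualRange
{
public:
    static constexpr std::uint64_t BlockSize = 1ull << 30; // 1GB per reservation

    explicit FLinearVirtualRange(std::uint64_t InPageSize)
        : PageSize(InPageSize)
    {
        // Page size must be a non-zero power of two.
        assert(PageSize > 0 && (PageSize & (PageSize - 1)) == 0);
    }

    // Returns the simulated base address of a page-aligned range of Size bytes.
    std::uint64_t Allocate(std::uint64_t Size)
    {
        const std::uint64_t Aligned = (Size + PageSize - 1) & ~(PageSize - 1);
        assert(Aligned > 0 && Aligned <= BlockSize);
        if (BlockEnd - Cursor < Aligned)
        {
            // Stand-in for reserving a fresh 1GB block from the OS.
            Cursor = NextBlockBase;
            BlockEnd = Cursor + BlockSize;
            NextBlockBase += BlockSize;
            ++NumBlocksReserved;
        }
        const std::uint64_t Result = Cursor;
        Cursor += Aligned;
        return Result;
    }

    int NumBlocksReserved = 0;

private:
    std::uint64_t PageSize;
    std::uint64_t Cursor = 0;
    std::uint64_t BlockEnd = 0;
    std::uint64_t NextBlockBase = BlockSize; // arbitrary non-zero simulated base
};
```

At the quoted rate of ~700MB of address space per second, a 128 TB limit would last on the order of 50 hours of continuous running.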
Change 4026047 by Rolando.Caloca
DR - Fix test/shipping
#jira UE-58148
Change 4026150 by Krzysztof.Narkowicz
Force proper ordering of buffer visualization materials - after tonemapping (so exposure doesn't influence it) and before editor stuff like icons.
#jira UE-57992
Change 4026226 by Rolando.Caloca
DR - Fix static analysis
#jira UE-58150
Change 4026354 by Jian.Ru
Debug check trying to catch a crash. Only enabled in editor build
#jira UE-50111
Change 4026655 by Rolando.Caloca
DR - Fix for static analysis
#jira UE-58149
Change 4026763 by Rolando.Caloca
DR - Remove references to defunct CCT to avoid confusing licensees
Change 4027167 by Uriel.Doyon
Fixed possible out of bound buffer access when serializing with FDuplicateDataWriter.
#jira UE-56509
Change 4027850 by Jian.Ru
Prevent log spam
#jira UE-50111
Change 4029546 by Rolando.Caloca
DR - Compile fixes
Change 4029624 by Yuriy.ODonnell
Addressed static analysis errors in MallocStomp
- VirtualAlloc return value is now explicitly checked.
- C6250 is suppressed, as VirtualFree does not release address space by design.
Change 4030225 by Yuriy.ODonnell
Static analysis warning fix: make sure declaration of Sleep() is consistent between Windows headers and TBB
The complexity with this particular case is that the warning is generated in synchapi.h, which is included by some Windows headers.
If a module includes TBB and then Windows platform headers, static analyzer will report this warning.
Suppressing it would require wrapping all instances of Windows header includes in third-party macros.
Current pragmatic solution is to modify the Sleep() declaration in TBB header to be consistent with Windows and to report the issue to Intel for a permanent fix.
Change 4030440 by Rolando.Caloca
DR - Fix crash on mobile
#jira UE-58222
Change 4030570 by Daniel.Wright
Allow null SRV's in uniform buffers for feature levels that don't support SRV's in shaders
Change 4030618 by Arne.Schober
DR - missing tangent/normal sign conversion after integration from main
#jira UE-58224
Change 4031588 by Rolando.Caloca
DR - vk - Fix compile error when missing vkCmdWriteBufferMarkerAMD
Change 4032145 by Mark.Satterthwaite
Fix UE-58268 by only emitting the base_instance/base_vertex variables required to fix-up the instance/vertex ID values to match D3D when the Metal version is 1.1 or higher, earlier versions don't support these features.
#jira UE-58268
Change 4032209 by Rolando.Caloca
DR - Fix crash on EngineTest: Mesh Batch's UserIndex is not a union anymore
Change 4033178 by Guillaume.Abadie
Fixes FXAA sampling outside viewports, which was causing a black outline on the bottom and right edges of the screen when ViewSize != BufferSize, problematic for some automated screenshot tests.
#jira UE-58151
Change 4034489 by Daniel.Wright
Fixed UStaticMeshComponent modifying its UStaticMesh when undoing a change. This caused a crash when other static mesh components using the same mesh asset were rendered, since their rendering state was not recreated. A component should not modify its asset during PostEditUndo.
* This behavior has been present for a long time but was previously hidden because only the vertex factory of the mesh asset is cached in static draw lists, not any of its rendering resources (eg vertex declaration).
Change 4035157 by Uriel.Doyon
Fixed deadlock in the streaming code when running with -onethread.
#jira UE-58299
Change 4035198 by Rolando.Caloca
DR - vk - Fix issue when an older SDK was installed, UBT would pick it (should pick the newer of ThirdParty\Vulkan or installed SDK).
#jira UE-58267
Change 4035730 by Arne.Schober
DR - Fix missing Fog parameters during LightScattering Injection
#jira UE-57608
Change 4035843 by Daniel.Wright
Reimplemented support for EyeAdaptation node in opaque materials
Change 4036837 by Marcus.Wassmer
Replace some of the screenshots to match new un-tonemapped buffer visualization
Change 4036980 by Rolando.Caloca
DR - vk - Fix deadlock contention during mem allocation on Linux
Change 4037225 by Guillaume.Abadie
Fixes jittering selection outline.
#jira UE-58350
Change 4038056 by Marcus.Wassmer
roll back changelist 4026150. breaks a bunch of automated tests by cutting off half the image.
Change can go back in later with that part fixed also
Change 4038296 by Jian.Ru
Static analysis fix
#jira UE-58377
Change 4038402 by Ben.Marsh
Suppress IncludeTool warnings caused by CL 3998947.
Change 4038514 by Arne.Schober
DR - Fix case with MVF where instance offset is not supported by the API (in this case only foliage OpenGL and TvOS). Usually the buffers are offset instead, but with MVF we do not use offset buffers, therefore the offset needs to be passed into the shader although we are drawing with an offset of 0.
#jira UE-57652
Change 4038747 by Marcus.Wassmer
Back out changelist 3853645, causing us to lose shadows in the shaderhair test
Change 4040138 by Rolando.Caloca
DR - Fix compile warning
Change 4041614 by Rolando.Caloca
DR - vk - Fix for Oculus module
#jira UE-58267
Change 3810277 by Daniel.Wright
Ray Traced Distance Field shadows use a two pass tile culling algorithm with no tile max - fixes flickering from tile overflow in dense areas or with a low sun angle. Costs .2ms on PS4.
The distance field scene buffers now use float4 on PS4 and Xbox, saves .1ms on PS4.
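Note (illustrative only): a common way to remove a fixed per-tile maximum is to count first, then fill. The CPU sketch below assumes that count / prefix-sum / fill structure; the names are hypothetical and this is not the shader implementation from the change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compact per-tile object lists: objects for tile T live in
// ObjectIndices[TileStart[T]] .. ObjectIndices[TileStart[T + 1] - 1].
struct FTileCullResult
{
    std::vector<int> TileStart;     // NumTiles + 1 offsets
    std::vector<int> ObjectIndices; // sized exactly, so no tile can overflow
};

// ObjectTiles[i] lists the tiles that object i overlaps.
inline FTileCullResult CullTwoPass(const std::vector<std::vector<int>>& ObjectTiles, int NumTiles)
{
    assert(NumTiles >= 0);
    FTileCullResult Result;
    Result.TileStart.assign(static_cast<std::size_t>(NumTiles) + 1, 0);

    // Pass 1: count how many objects touch each tile.
    for (const std::vector<int>& Tiles : ObjectTiles)
    {
        for (int Tile : Tiles)
        {
            assert(Tile >= 0 && Tile < NumTiles);
            ++Result.TileStart[static_cast<std::size_t>(Tile) + 1];
        }
    }

    // Prefix sum turns the counts into start offsets.
    for (int Tile = 0; Tile < NumTiles; ++Tile)
    {
        Result.TileStart[static_cast<std::size_t>(Tile) + 1] += Result.TileStart[static_cast<std::size_t>(Tile)];
    }

    // Pass 2: write each object index into its tile's reserved range.
    Result.ObjectIndices.resize(static_cast<std::size_t>(Result.TileStart[static_cast<std::size_t>(NumTiles)]));
    std::vector<int> Cursor(Result.TileStart.begin(), Result.TileStart.end() - 1);
    for (std::size_t Object = 0; Object < ObjectTiles.size(); ++Object)
    {
        for (int Tile : ObjectTiles[Object])
        {
            const int Slot = Cursor[static_cast<std::size_t>(Tile)]++;
            Result.ObjectIndices[static_cast<std::size_t>(Slot)] = static_cast<int>(Object);
        }
    }
    return Result;
}
```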
Change 3817029 by Uriel.Doyon
Added UVolumeTexture, which uses 3D textures. Compressed formats are supported on DX11, DX12, PS4 and XB1.
Projects targeting OpenGL don't have access to compressed formats (as the implementation has texture tiling issues).
Added "r.AllowVolumeTextureAssetCreation", set to 0 by default, which controls whether volume textures can be sampled in materials and whether they can be created from 2D texture assets.
Platforms not supporting BC7 will now fall back on RGBA8 instead of DXT to preserve quality, in an attempt to increase usage of BC7.
#jira UE-32263
Change 3819960 by Michael.Lentine
Expose UEPhysics Clothing Parameters through UI.
Change 3823401 by Rolando.Caloca
DR - Add NumQueriesInBatch to RHIBeginOcclusionQueryBatch
Change 3844805 by Arne.Schober
DR - Increased Intermediate normal of Umodel and Skelmesh from 8bit Unorm Compressed to float. A resave/rebuild/reimport of the meshes is recommended to recover some lost precision.
Fixed an issue with compressed (packed) normals on the GPU which were off by one integer representation. Also switched from UNORM to SNORM to get a discrete zero representation and removed some mads from all the VertexShaders.
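Note (illustrative only): the sketch below shows why 8-bit SNORM has an exact zero while a [-1, 1] value remapped into 8-bit UNORM does not, and why the UNORM path needs an extra multiply-add on decode. The helper names are hypothetical and the rounding convention is an assumption, not the engine's packing code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

inline float ClampSigned(float X)
{
    return X < -1.0f ? -1.0f : (X > 1.0f ? 1.0f : X);
}

// UNORM path: remap [-1, 1] to [0, 1] before quantizing to 0..255.
inline std::uint8_t PackUnorm8(float X)
{
    return static_cast<std::uint8_t>(std::lround((ClampSigned(X) * 0.5f + 0.5f) * 255.0f));
}

// Decoding needs the "* 2 - 1" multiply-add to get back to [-1, 1].
inline float UnpackUnorm8(std::uint8_t V)
{
    return (static_cast<float>(V) / 255.0f) * 2.0f - 1.0f;
}

// SNORM path: quantize directly to -127..127, so 0.0 maps to integer 0.
inline std::int8_t PackSnorm8(float X)
{
    return static_cast<std::int8_t>(std::lround(ClampSigned(X) * 127.0f));
}

// No remap is needed on decode; -128 is clamped so it also decodes to -1.
inline float UnpackSnorm8(std::int8_t V)
{
    const float F = static_cast<float>(V) / 127.0f;
    return F < -1.0f ? -1.0f : F;
}
```

Round-tripping 0.0 through the SNORM helpers returns exactly 0.0, while the UNORM helpers return a small non-zero value because 127.5 is not representable as an 8-bit integer.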
Change 3847283 by Marcus.Wassmer
Extra fixes from Uriel
Change 3876607 by Rolando.Caloca
DR - Use render passes when running occlusion queries
- Removes the RHI(Begin|End)OcclusionQueryBatch API
Change 3903799 by Daniel.Wright
[Integrate] Pass Uniform Buffers
* All pass-constant shader inputs should go into the appropriate pass uniform buffer, instead of being set per-draw
* Moved many per-draw base pass parameters over to the Base Pass Uniform Buffer
* Opaque and Translucent base pass shaders have different uniform buffers, which allows compile errors when accessing an invalid resource (eg GBuffer in Opaque), instead of silently falling back to GBlackTexture
Uniform buffers can now contain nested structs with UNIFORM_MEMBER_STRUCT()
* This allows composing a uniform buffer at a particular update frequency out of many features, with encapsulation of each feature's parameters in a struct.
* Eg deferred fog uses FFogUniformParameters, but so does translucency in the base pass, where FFogUniformParameters is reused nested inside the base pass uniform buffer.
* Resources can now be located anywhere in the uniform buffer. Padding is inserted to the cbuffer representation to keep memory layouts matching. In the future the cbuffer could be compacted.
* RemoveUniformBuffersFromSource() which works around HLSLCC lack of struct initializers now handles nested structs
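As a rough analogy in plain C++ (deliberately not using the engine's uniform buffer macros, whose exact usage is not shown here), nesting one feature's parameter struct inside a pass-level struct looks like the sketch below. The struct and member names are hypothetical, and the 16-byte alignment is an assumption standing in for constant-buffer register alignment. It shows how padding appears in front of the nested struct to keep the layout aligned.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature parameters, reusable by more than one pass.
struct alignas(16) FSketchFogParameters {
    float FogParameters[4];
    float FogColor[4];
};

// Hypothetical pass-level buffer that embeds the feature struct.
struct alignas(16) FSketchBasePassParameters {
    float SomeScalar;            // occupies bytes 0..3
    FSketchFogParameters Fog;    // padded so it starts on a 16-byte boundary
};

static_assert(offsetof(FSketchBasePassParameters, Fog) % 16 == 0,
              "nested struct must start on a 16-byte boundary");
```

Here 12 bytes of padding follow SomeScalar, which is the kind of gap a later compaction pass could reclaim.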
Change 3917500 by Rolando.Caloca
DR - Change depth bounds so only the enable bit is in the PSO, allow min/max to be dynamically modified
Change 3964907 by Guillaume.Abadie
Implements RectList topology support in RHI.
Change 3979171 by Mark.Satterthwaite
Copying //Tasks/UE4/Dev-UERNDR-354-mtlpp to Dev-Rendering (//UE4/Dev-Rendering):
Rewrites MetalRHI in terms of mtlpp, which is a C++ wrapper library built around Metal's Objective-C API that attempts to reduce overheads and eliminate resource lifetime errors.
Regarding mtlpp:
- The mtlpp library uses C++ constructor/destructor and smart-pointer style management of Objective-C retain/release calls to prevent over- and under-release problems.
- To reduce Objective-C overheads the mtlpp library caches the internal C-function that implements the Objective-C selectors for the most commonly used Metal protocol types and calls the function directly - this avoids objc_msgSend which does this look-up dynamically and thus improves CPU performance slightly.
- Another advantage is that mtlpp provides infrastructure to extend the Metal API slightly to help improve MetalRHI - the two important aspects are mtlpp::CommandBufferFence which provides a consistent CPU<->GPU synchronisation primitive and sub-buffer allocations from mtlpp::Buffer which allow for far superior memory management.
- Validation functionality is also provided by mtlpp to detect CPU vs. GPU data races and resource lifetime validation - this is expensive and is thus optional and compiled out from Shipping binaries that should be used when performance is most critical. The validation only works between resource modification and *submitted* command-buffers - anything that is being actively encoded on the CPU is ignored and it remains the responsibility of the application to validate the order of operations when encoding.
Apple Platform:
- LLM support which tracks Objective-C objects is enabled only on macOS - we don't have the necessary libraries to intercept and override the internal system calls on iOS.
MetalRHI:
- All the types are switched over, (mostly) insulating the external API from the horror of Metal and Objective-C.
- Buffers are now managed quite differently, small buffers are allocated from a magazine allocator that allocates in fixed blocks from a larger parent buffer, intermediate sized buffers are allocated from a simple heap allocator that wraps a larger buffer and anything of reasonable size (>2Mb) will use the pooled allocator. This *radically* reduces the number of buffer resources, by as much as a factor of 10, because they are now sub-allocated without the need to use MTLHeap or MTLFence so they are performance equivalent to the existing implementation on the GPU and much faster on the CPU. Total memory use is approximately the same.
- Vertex & index buffer management has been updated to reflect changes in the management and to avoid reallocating buffers which provide a Linear Texture (for SRVs) unless strictly necessary. This ensures that even in cases where a dynamic buffer is updated multiple times in a frame it will still work acceptably well.
- The Metal ring-buffer implementation is completely different again, this time it can use Managed memory on macOS which allows for much better performance on eGPUs which will be more and more important for Mac.
- Everything that needs to wait on a command-buffer fence (rather than a command-buffer itself) now uses mtlpp::CommandBufferFence, which prevents race conditions between the different command-buffer handlers (which sometimes execute out of order).
- LLM tracking should now report the same data as the MetalRHI stats group for buffer & texture allocations - there is no segmentation for Vertex/index/Structured/Uniform allocations in Metal so these numbers are going to be wrong and will need to be rethought.
- What will be unseen are the number of small but important resource usage fixes that avoid stale resources from being bound to the device after the point at which they become invalid. This should eliminate a class of errors where the GPU uses a resource pointer that is modified by the CPU and was necessary to satisfy the new mtlpp validation code.
Other:
- Remove the Metal focused workarounds from the ClothBuffer resource binding and related vertex-buffer SRV - these were put in when MetalRHI/MetalShaderFormat couldn't handle float->uint conversions correctly and they should now.
- Fix a validation error caused by trying to render a 0-sized scissor rect which is invalid in Metal and simply pointless elsewhere.
- Consistency of disabling the Manual Vertex Fetch behaviour in shaders.
#jira UERNDR-354
Change 3979312 by Rolando.Caloca
DR - Remove bogus bKeepOriginalSurface parameter in CopyToResolveTarget
Change 4005122 by Rolando.Caloca
DR - Support for PS4 Index Buffer UAVs
Change 4016298 by Guillaume.Abadie
Fixes DOF hybrid scattering on platforms that support RectList topology.
Change 4018575 by Guillaume.Abadie
Optimises DOF's reduce pass when doing scattering compilation.
Change 4020317 by Guillaume.Abadie
Implements WaveBroadcastIntrinsics.ush.
[CL 4042226 by Marcus Wassmer in Main branch]
2018-05-01 10:36:33 -04:00
static void ComputeUpdateRegionsAndUpdateViewState (
2016-04-13 21:24:38 -04:00
FRHICommandListImmediate & RHICmdList ,
2020-07-06 18:58:26 -04:00
FViewInfo & View ,
2016-04-13 21:24:38 -04:00
const FScene * Scene ,
FGlobalDistanceFieldInfo & GlobalDistanceFieldInfo ,
int32 NumClipmaps ,
2021-02-04 15:30:42 -04:00
float MaxOcclusionDistance ,
bool bLumenEnabled )
2015-05-11 20:04:15 -04:00
{
GlobalDistanceFieldInfo . Clipmaps . AddZeroed ( NumClipmaps ) ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
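One common source of waste in a buddy scheme is that each request is rounded up to a power-of-two block. The sketch below shows that arithmetic and a size threshold that routes larger requests elsewhere. It is an illustration only: the function names and the minimum block size are hypothetical, and whether this rounding is the exact waste measured in the changelist is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not the D3D12RHI allocator.

// Smallest power-of-two block (starting at MinBlock) that fits Size bytes.
constexpr uint64_t BuddyBlockSize(uint64_t Size, uint64_t MinBlock) {
    uint64_t Block = MinBlock;
    while (Block < Size) {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding for a single request.
constexpr uint64_t BuddyWaste(uint64_t Size, uint64_t MinBlock) {
    return BuddyBlockSize(Size, MinBlock) - Size;
}

// Requests above the threshold bypass the buddy allocator entirely.
constexpr bool UsesBuddyAllocator(uint64_t Size, uint64_t MaxBuddySize) {
    return Size <= MaxBuddySize;
}
```

With a 512KB threshold, a 300KB request would sit in a 512KB block and waste 212KB; with a 4KB threshold the same request never reaches the buddy allocator.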
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
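The fence reuse bug described above comes down to fence values needing to increase monotonically. A minimal CPU-only sketch, assuming a hypothetical FSketchFence type rather than the actual D3D12RHI fence wrapper, shows why waiting on 0 can never block and why handing out ever-increasing values avoids the problem.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a GPU fence; not the D3D12RHI class.
struct FSketchFence {
    uint64_t LastSignaled = 0;  // last value handed to the queue
    uint64_t Completed = 0;     // last value the GPU is known to have reached

    // Always hands out a new, larger value; nothing is reset or reused.
    uint64_t Signal() { return ++LastSignaled; }

    // Called when the (simulated) GPU reaches a signal.
    void GpuReached(uint64_t Value) {
        if (Value > Completed) {
            Completed = Value;
        }
    }

    // A wait on Value is satisfied once the completed value has caught up.
    bool IsComplete(uint64_t Value) const { return Completed >= Value; }
};
```

A fresh fence already satisfies a wait on 0, so signaling the maximum unsigned value and then waiting on the wrapped value 0 returns immediately without the GPU having done any work.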
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
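The idea of a non-contiguous page allocator can be sketched as a free list of physical pages plus a per-allocation table that maps consecutive virtual page indices to whichever physical pages were available. The class below is a hypothetical illustration under that assumption; it does not model ESRAM, the platform mapping calls, or deferred deallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical sketch, not the engine's ESRAM allocator.
class FSketchPageAllocator {
public:
    explicit FSketchPageAllocator(uint32_t NumPages) {
        // Push pages in reverse so page 0 is handed out first.
        for (uint32_t Page = NumPages; Page > 0; --Page) {
            FreePages.push_back(Page - 1);
        }
    }

    // Returns one physical page per virtual page index, or nullopt if
    // there are not enough free pages. The pages need not be adjacent.
    std::optional<std::vector<uint32_t>> Allocate(uint32_t PageCount) {
        if (PageCount > FreePages.size()) {
            return std::nullopt;
        }
        const auto First = FreePages.end() - static_cast<std::ptrdiff_t>(PageCount);
        std::vector<uint32_t> Mapping(First, FreePages.end());
        FreePages.erase(First, FreePages.end());
        return Mapping;
    }

    // Returns an allocation's pages to the free list for reuse.
    void Free(const std::vector<uint32_t>& Mapping) {
        FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<uint32_t> FreePages;
};
```

Because each allocation carries its own page table, a request can succeed even when the free pages are scattered, which a contiguous layout scheme could not satisfy.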
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
GlobalDistanceFieldInfo . MostlyStaticClipmaps . AddZeroed ( NumClipmaps ) ;
2015-05-11 20:04:15 -04:00
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing,
Added th MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
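The page-based allocation described in change 3271594 can be illustrated with a small free-list sketch. This is not the engine's ESRAM allocator: the class name `FSketchPageAllocator` and its methods are hypothetical, and the step that maps physical pages onto a contiguous virtual address range is only represented by the order of the returned page indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a non-contiguous page allocator (not engine code).
class FSketchPageAllocator
{
public:
	explicit FSketchPageAllocator(std::uint32_t NumPages)
	{
		// Push in reverse so that page 0 is handed out first.
		for (std::uint32_t Page = NumPages; Page > 0; --Page)
		{
			FreePages.push_back(Page - 1);
		}
	}

	// Appends NumPages physical page indices to OutPages. OutPages[i] stands
	// for the physical page backing virtual page i of the resource.
	// Returns false and allocates nothing if too few pages are free.
	bool Allocate(std::uint32_t NumPages, std::vector<std::uint32_t>& OutPages)
	{
		if (NumPages > FreePages.size())
		{
			return false;
		}
		for (std::uint32_t Index = 0; Index < NumPages; ++Index)
		{
			OutPages.push_back(FreePages.back());
			FreePages.pop_back();
		}
		return true;
	}

	// Returns pages to the free list so later allocations can reuse them.
	void Free(std::vector<std::uint32_t>& Pages)
	{
		FreePages.insert(FreePages.end(), Pages.begin(), Pages.end());
		Pages.clear();
	}

	std::size_t NumFree() const { return FreePages.size(); }

private:
	std::vector<std::uint32_t> FreePages;
};
```

Because any free page can back any virtual page, a request succeeds whenever enough pages are free in total, regardless of how fragmented the physical range has become.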
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
// Cache the heightfields update region boxes for fast reuse for each clip region.
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3219450)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3148067 on 2016/10/01 by Daniel.Wright
Support for ReflectionEnvironment and light type show flags with ForwardShading
Change 3149085 on 2016/10/03 by Daniel.Wright
Support for ReflectionEnvironment show flag in base pass reflections without any shader overhead
Change 3162206 on 2016/10/13 by Chris.Bunner
Merging Dev-MaterialLayers to Dev-Rendering, CL 3161593:
Material expressions; Trig, fast-trig, saturate, round, truncate, pre-skinned normal.
Added CustomEyeTangent to material attributes.
Resolved some hard-coded attribute typing and other minor fixes.
Change 3186067 on 2016/11/03 by Daniel.Wright
Updated Stationary primitive tooltip to indicate that it allows the primitive to be changed, but not moved
Change 3186069 on 2016/11/03 by Daniel.Wright
Using a weighted geometric mean to combine multiple Distance Field Indirect Shadows, greatly reduces over-occlusion when overlap is high
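As an illustration of change 3186069, a weighted geometric mean of visibility values v_i with weights w_i is exp(sum(w_i * ln v_i) / sum(w_i)). The sketch below is not the engine's shader code; the function name, the clamp constant, and the choice of weights are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an engine function: combines per-occluder
// visibility values in (0, 1] using a weighted geometric mean.
double CombineShadowsWeightedGeometricMean(const std::vector<double>& Visibility,
                                           const std::vector<double>& Weights)
{
	assert(Visibility.size() == Weights.size());
	double WeightedLogSum = 0.0;
	double WeightSum = 0.0;
	for (std::size_t Index = 0; Index < Visibility.size(); ++Index)
	{
		// Clamp to avoid log(0) for fully occluded samples (assumed epsilon).
		const double ClampedVisibility = std::fmax(Visibility[Index], 1e-6);
		WeightedLogSum += Weights[Index] * std::log(ClampedVisibility);
		WeightSum += Weights[Index];
	}
	// With no contributing occluders the result is fully lit.
	return WeightSum > 0.0 ? std::exp(WeightedLogSum / WeightSum) : 1.0;
}
```

Two overlapping shadows of visibility 0.5 with equal weights combine to 0.5 under this scheme, whereas multiplying them would give 0.25, which is the kind of over-occlusion the change describes.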
Change 3186084 on 2016/11/03 by Mark.Satterthwaite
Duplicate 3172511:
Don't set Metal resource option fields on texture descriptors when running on an OS that doesn't support them.
#jira UE-37481
Change 3186089 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3169764:
Fixed automatic conversion of G8_sRGB into RGBA8_sRGB required for Mac Metal, which fixes FORT-27627.
#jira FORT-27627
Change 3186113 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183807:
Change the way we access the Metal viewport's backbuffer, to reduce possible causes of FORT-31649:
- Added console variable "rhi.Metal.SupportsIntermediateBackBuffer" to control whether to use an extra render-target so we can support screenshots & movie capture, or render directly to the back-buffer to save memory & GPU performance. Still defaults to ON for Mac & OFF for iOS/tvOS.
- Change the way we handle updates to the back-buffer size to ensure that the different threads access their intended version.
#jira FORT-31649
Change 3186116 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183823:
Record Metal resource & state objects used in a command-buffer when rhi.Metal.RuntimeDebugLevel is set to 3 or higher. The object labels, types & descriptions will be printed on failure - if the object is deleted prior to this then we have a lifetime error and it will crash at this point and can be debugged further using our -metalretainrefs command-line option or Xcode's zombie-objects.
Used to verify that FORT-31649 is not a simple resource lifetime error and thereby speed up Apple/vendor investigations.
#jira FORT-31649
Change 3186818 on 2016/11/04 by Chris.Bunner
PR #2907 Export UMaterialExpressionNoise (contributed by kayosiii).
Change 3186979 on 2016/11/04 by Rolando.Caloca
DR - Misc minor cleanup
Change 3187169 on 2016/11/04 by Uriel.Doyon
Incremental insertion of level data between PostLoad and AddToWorld
Change 3187205 on 2016/11/04 by Mark.Satterthwaite
Compile fixes for iOS.
Change 3187389 on 2016/11/04 by Uriel.Doyon
Fix for possible stall when loading hidden level
Change 3187598 on 2016/11/04 by Michael.Trepka
MetalViewport compile fix
Change 3187678 on 2016/11/04 by Uriel.Doyon
Fix for landscape grass textures not being streamed in correctly.
Change 3187731 on 2016/11/04 by Rolando.Caloca
DR - Start making type safe some cross compiler enums
Change 3187824 on 2016/11/04 by Rolando.Caloca
DR - clang compile fix
Change 3187953 on 2016/11/04 by Rolando.Caloca
DR - vk - Mac compile fix
Change 3188696 on 2016/11/07 by Mark.Satterthwaite
Another iOS compile fix for new MetalViewport validation code.
Change 3188906 on 2016/11/07 by Rolando.Caloca
DR - Show permutation of LUTBlender
Change 3189094 on 2016/11/07 by Chris.Bunner
Fix RemoveAAJitter from projection matrix.
#jira UE-37701, UE-38003
Change 3189134 on 2016/11/07 by Daniel.Wright
Fix for CreateRenderTarget2D called in construction script during cooking
Change 3189145 on 2016/11/07 by Chris.Bunner
Follow-up to CL 3186818, export UMaterialExpressionVectorNoise.
Change 3189239 on 2016/11/07 by Daniel.Wright
Added show flag for Contact Shadows, disabled in planar reflections
Change 3189252 on 2016/11/07 by Daniel.Wright
Support for Reflection Capture intensity with simple reflections, which are the default with Forward Shading
Change 3189406 on 2016/11/07 by Mark.Satterthwaite
Really fix the last of the iOS compile errors from changes to the MetalViewport code.
Change 3190854 on 2016/11/08 by Ben.Woodhouse
XB1: Fix memory corruption with RHICreateVertexBuffer and RHICreateIndexBuffer when using initial data (Procedural Mesh Component crash)
#jira UE-34264
#fyi james.golding
#fyi keith.judge
Change 3190962 on 2016/11/08 by Olaf.Piesche
Unshelved from pending changelist '3176615' - Gil's fix for race condition with particle vertex factory reuse across different passes; potential to fix a number of issues
Change 3191959 on 2016/11/09 by Uriel.Doyon
Removed some static primitives from the dynamic primitive handler for texture streaming.
Change 3193122 on 2016/11/10 by Chris.Bunner
Always update non-preview material resources for use in code preview.
#jira UE-38223
Change 3193190 on 2016/11/10 by Gil.Gribb
UE4 - Fixed rare bug with shadow groups rendering things that have not been setup to render this frame.
#jira UE-36379
Change 3193523 on 2016/11/10 by Uriel.Doyon
Fixed incorrect section bounds used for texture streaming.
Change 3193962 on 2016/11/10 by Uriel.Doyon
Added defrag of dynamic bounds used for the texture streaming. Allows unused bounds to be removed over time.
Change 3193974 on 2016/11/10 by Uriel.Doyon
New "Required Texture Resolution" view mode. Showing the ratio between the currently streamed texture resolution and the resolution wanted by the GPU.
Change 3194109 on 2016/11/10 by Uriel.Doyon
Another patch on material bounds used for texture streaming.
Change 3194665 on 2016/11/11 by Chris.Bunner
Duplicated behavior for inherited velocity scaling to vert/surface spawned particles.
Change 3194734 on 2016/11/11 by Rolando.Caloca
DR - vk - Simplified some texture casting
Change 3194867 on 2016/11/11 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3195176 on 2016/11/11 by Chris.Bunner
Fixed incorrectly updated NVAPI error.
Change 3195425 on 2016/11/11 by Uriel.Doyon
Fixed possible invalid level reference in the texture streamer
Change 3196512 on 2016/11/14 by Gil.Gribb
Merging //UE4/Dev-Main@3196156 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3196750 on 2016/11/14 by Marcus.Wassmer
Fix ordering problem with GPU cache transitions
Change 3196815 on 2016/11/14 by Daniel.Wright
Suppressed 'Instanced stereo rendering is not supported' warning showing up in CIS
Change 3196818 on 2016/11/14 by Daniel.Wright
Fixed FIndirectLightingCache::UpdateCachePrimitivesInternal churning through a bunch of temporary memory
Change 3196819 on 2016/11/14 by Daniel.Wright
Volume lighting samples are allowed outside of the importance volume if their influence affects the volume. Fixes black indirect lighting on movable components in maps with small importance volumes.
Volume lighting samples placed on surfaces use a radius that covers the layer height spacing, which prevents an uncovered region between layers
Change 3197243 on 2016/11/14 by Uriel.Doyon
Async Task For Updating static component LastRender time
#jira UE-24268
Change 3197359 on 2016/11/14 by Daniel.Wright
Added Inscattering Texture controls to ExponentialHeightFog
* When InscatteringColorCubemap is specified, directional light inscattering is disabled
* Lerps between 1x1 mip at NonDirectionalInscatteringColorDistance to mip 0 at FullyDirectionalInscatteringColorDistance
* Added FogCutoffDistance, so artists can prevent fog on skyboxes (requires fog to be setup matching the fog that was rendered into the sky texture so that distant mountains match)
* Fog shader permutations based on what feature is enabled
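The mip selection described in change 3197359 can be sketched as a clamped linear interpolation over distance. This is an assumption about the shape of the blend, not the engine's fog shader: the function name `ComputeInscatteringMip` and the linear ramp are hypothetical, and only the two distance parameters come from the change description.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sketch: returns the cubemap mip level to sample for fog at the
// given distance. The coarsest (1x1) mip is used at or before
// NonDirectionalDistance and mip 0 at or beyond FullyDirectionalDistance.
float ComputeInscatteringMip(float Distance,
                             float NonDirectionalDistance,
                             float FullyDirectionalDistance,
                             int NumMips)
{
	assert(NumMips >= 1);
	const float Range = FullyDirectionalDistance - NonDirectionalDistance;
	// Degenerate range: treat everything as fully directional.
	float Alpha = Range > 0.0f ? (Distance - NonDirectionalDistance) / Range : 1.0f;
	Alpha = std::min(std::max(Alpha, 0.0f), 1.0f);
	const float CoarsestMip = static_cast<float>(NumMips - 1);
	// Alpha = 0 selects the 1x1 mip, Alpha = 1 selects mip 0.
	return CoarsestMip * (1.0f - Alpha);
}
```

With an 8-mip cubemap and distances of 1000 and 5000, a sample halfway between the two distances would land on mip 3.5 under this linear assumption.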
Change 3198419 on 2016/11/15 by Chris.Bunner
PS4 HDR: Runtime toggle (backbuffer recreation on resize matching), UI composition. Matches PC behavior and controls.
HDR: Generalized buffer formats, cvar consistency pass, LUT for UI composition, refactoring common functions.
Exposed RHICreateTargetableShaderResource3D.
Moved some (translucent) volume rendering helpers to allow access in Slate.
Change 3198822 on 2016/11/15 by Daniel.Wright
Mac compile fix
Change 3199509 on 2016/11/15 by Uriel.Doyon
Added support for viewmode param asset name (and not just param value).
Used to investigate texture streamer behavior.
Change 3199578 on 2016/11/15 by Rolando.Caloca
DR - Add some shader resource tables to SCW when running with -directcompile
Change 3199698 on 2016/11/15 by Rolando.Caloca
DR - vk - Refactor shader & descriptor bindings
Change 3199712 on 2016/11/15 by Rolando.Caloca
DR - vk - r.Vulkan.StripGlsl to always strip glsl at runtime to save memory per shader
Change 3199717 on 2016/11/15 by Rolando.Caloca
DR - vk - Show hitching PSO info again
Change 3199750 on 2016/11/15 by Rolando.Caloca
DR - SCW clang compile fixes
Change 3200353 on 2016/11/16 by Rolando.Caloca
DR - vk - Mac fix
Change 3200358 on 2016/11/16 by Chris.Bunner
Only allow UI composition on platforms we currently use it.
Change 3200823 on 2016/11/16 by Chris.Bunner
Remove expression key attribute ID when not translating an attribute output to allow intended expression sharing.
#jira UE-38699
Change 3200947 on 2016/11/16 by Mark.Satterthwaite
Fix UE-38695 by not trying to resize the viewport on the wrong thread.
#jira UE-38695
Change 3201069 on 2016/11/16 by Daniel.Wright
Fog inscattering texture limited to SM4 and above, fixes ES2 compile errors
Change 3201346 on 2016/11/16 by Brian.Karis
Temporal AA fix for correct edge gradients.
Filtering now combined with importance sampling.
Enabled Catmull-Rom resolve filter. Results are now slightly sharper.
Fixed antighosting. Still requires a dilation to be perfect.
Optimized bicubic filtering to 5 taps instead of 9.
Cleaned out unused code.
Change 3201369 on 2016/11/16 by Brian.Karis
Bicubic texture sample
Change 3201522 on 2016/11/16 by Rolando.Caloca
DR - vk - Fix static analysis issues
Change 3201878 on 2016/11/17 by Chris.Bunner
Temporarily disable Nvapi HDR error logging.
#jira UE-38529
Change 3202108 on 2016/11/17 by Simon.Tovey
Assets with easy repro for flickering particles bug
Change 3202181 on 2016/11/17 by Rolando.Caloca
DR - vk - CIS android fix
Change 3202325 on 2016/11/17 by Ben.Woodhouse
Integrate 4.14.1 fix from 14 //UE4/Release-4.14 (@3201850)
Fix CreateVertexbuffer and CreateIndexBuffer memory corruption (Procedural Mesh Component crash)
#jira UE-34264
Change 3204394 on 2016/11/18 by Guillaume.Abadie
PR #2808: AlphaComposite Fog Opacity fix (Contributed by moritz-wundke)
#br Ben.Woodhouse
Change 3204428 on 2016/11/18 by Guillaume.Abadie
Fixes a couple of issues in decals:
* Crash in FDecalDrawingPolicyFactory::DrawMesh()
* ActorPosition material expression
* PixelNormalWS material expression
* Missing renaming from DEFERRED_DECAL to DECAL_PRIMITIVE
#jira UE-38327, UE-38158, UE-37818, UE-37350
Change 3204429 on 2016/11/18 by Uriel.Doyon
Darker default undefined accuracy.
Reenabled the texture streaming build in the build all.
Change 3204458 on 2016/11/18 by Chris.Bunner
Shader truncation warnings fix.
Change 3204459 on 2016/11/18 by Chris.Bunner
Engine 'Passthrough' material function fix. V4 is now actually a V4.
Change 3204460 on 2016/11/18 by Chris.Bunner
Correctly handle some known Nvapi warnings.
#jira UE-38529
Change 3204653 on 2016/11/18 by Marc.Olano
Helper functions for tiled textures
Checking in for Ryan Brucks
Change 3204863 on 2016/11/18 by Arne.Schober
DR - Replaced ENQUEUE_UNIQUE_RENDER_COMMAND with a Debuggable template Implementation
Change 3204939 on 2016/11/18 by Arne.Schober
DR - Make clang happy
Change 3204968 on 2016/11/18 by Arne.Schober
DR - UE-38494 - Fixed SpeedTree Wind crash, when force deleting the Asset.
Change 3206293 on 2016/11/21 by Uriel.Doyon
New member bHasStreamingUpdatePending in UTexture2D to delay update of global distance fields.
Set to true when the streamer can possibly load a mip in the near future.
#jira UE-37787
Change 3206551 on 2016/11/21 by Chris.Bunner
Added material update context when forcing all shaders to recompile.
#jira UE-38481
Change 3206644 on 2016/11/21 by Benjamin.Hyder
Updating Planar Reflection example in TM-Shadermodels.
Change 3206899 on 2016/11/21 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3206900 on 2016/11/21 by Rolando.Caloca
DR - Added missing strings for shader formats
Change 3206983 on 2016/11/21 by Rolando.Caloca
DR - vk - Support for SV_Coverage
Change 3207237 on 2016/11/22 by Simon.Tovey
Exporting particle module base and a couple of child classes as it's commonly requested.
#test compiles
Change 3207241 on 2016/11/22 by Gil.Gribb
Merging //UE4/Dev-Main@3206998 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3207520 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //Fortnite/Main@3206301
Fixed GPU hang in Zone Map view. Was an issue with RenderThread using the device context without appropriate RHIThread flushes.
#jira FORT-31616
#code_review keith.judge
Change 3207541 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //fortnite/Main@3207422
* Fix UpdateTexture3D to create a staging texture of the region to update rather than the whole texture. This prevents distance fields crashing during update (allocating 18GB per frame in some cases)
* Put UpdateTexture2D DMA support onto a cvar, disabled by default (corruption issues reported by licensees, plus not sure it's actually faster - could be slower due to reduced bandwidth)
* Fix UpdateTexture2D to only create a staging texture of the region to update, saving memory
#jira UE-38609
Change 3207654 on 2016/11/22 by Chris.Bunner
Don't flag 16-bit PNG/JPG textures as sRGB on import.
#jira UE-30279
Change 3208434 on 2016/11/22 by Rolando.Caloca
DR - vk - UAV transitions
Change 3208490 on 2016/11/22 by Chris.Bunner
Break material code sharing when we detect an unresolvable loop.
By default change IsResultMA loop detection to stop on functions as we can determine type definitively.
Unified IsResultMA detection across switch nodes.
Change 3208860 on 2016/11/23 by Rolando.Caloca
DR - vk - Fix some format issues
Change 3209265 on 2016/11/23 by Arne.Schober
DR - originally unshelved from 3153924 - Made Depth and Velocity Rendering Passes use the PSO-only RHI interface.
We are now passing down two structs that collect all the necessary information for the drawing policies to construct a PSO object.
One during construction of the Policy, which contains information about the CullMode, FillMode and PrimType.
And another during rendering that passes information like BlendState and DepthStencilState down to the low-level renderer into SetSharedState.
Performance of the static drawlist is slightly slower (less than 0.1ms on Consoles) due to some additional branches and copies. The branches in the FDrawingPolicyRenderState will go away as soon as everything is converted to use the PSO interface.
Performance of the GPU is slightly better due to fewer context rolls (mainly CullMode sorts in differently now)
Change 3209305 on 2016/11/23 by Guillaume.Abadie
Fix contact shadow's assumption on object thickness
Change 3209334 on 2016/11/23 by Brian.Karis
Fixed TAA handling of alpha. Switched the meaning of AA_ALPHA to make sense.
Change 3209903 on 2016/11/24 by Guillaume.Abadie
Cherry picks alpha through post processing changelists 3201959, 3204143 and 3209883 from //UE4/Private-Partner-NREAL
Change 3209973 on 2016/11/24 by Ben.Woodhouse
Fix D3D11 and 12 static analysis warnings reported by Rob Troughton of Coconut Lizard (http://coconutlizard.co.uk/blog/ue4/pvs-studio-part5/)
Change 3210023 on 2016/11/24 by Uriel.Doyon
Fixed an issue with DropDetail when FixedFrameRate was set to a value smaller than MinDesiredFrameRate.
#jira UE-37210
Change 3210026 on 2016/11/24 by Ben.Woodhouse
Disable renderthread hang detection if a debugger is present, so we can debug the renderthread without crashing
Change 3210049 on 2016/11/24 by Ben.Woodhouse
Fix mac build
Change 3210071 on 2016/11/24 by Uriel.Doyon
Fixed an issue with masked materials and shader complexity viewmode when DBuffer Decals are enabled.
#jira UE-37542
Change 3210374 on 2016/11/25 by Ben.Woodhouse
* Fix issues with fast cleared dbuffer targets not being resolved when no decals are in the scene. This caused graphical corruption on XB1 and ensure failures on PS4 (with RHIThread disabled)
* Move Decal rendertarget manager function implementations out of the header.
#jira UE-38879
Change 3210390 on 2016/11/25 by Uriel.Doyon
Fixed cubemap resourcesize not taking into account mipgen settings
#jira UE-37045
Change 3210407 on 2016/11/25 by Uriel.Doyon
"resavepackages" commandlet now supports -buildtexturestreaming that rebuilds the map texture streaming data.
That can be used in combination with -buildlighting.
Change 3210563 on 2016/11/27 by Rolando.Caloca
DR - vk - Integrate cached memory fixes and PF_D24 format fix
#jira UE-39025
PR #2974
Change 3210564 on 2016/11/27 by Rolando.Caloca
DR - Fix for GL linker
PR #2975
#jira UE-39029
Change 3210592 on 2016/11/27 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3210597 on 2016/11/27 by Rolando.Caloca
DR - vk - Prep for staging UB copies to GPU memory
Change 3210600 on 2016/11/27 by Rolando.Caloca
DR - vk - Extract generic range code
Change 3210613 on 2016/11/27 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOnDispatch
Change 3211054 on 2016/11/28 by Rolando.Caloca
DR - vk - Missing reference
Change 3211330 on 2016/11/28 by Chris.Bunner
Shader compile error for max texture coordinate count on skinned meshes.
Change 3211384 on 2016/11/28 by Arne.Schober
DR - Enforce move on EnqueueRenderCommand Lambda
Change 3211431 on 2016/11/28 by Gil.Gribb
Merging //UE4/Dev-Main@3211016 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3211738 on 2016/11/28 by Gil.Gribb
IWYU fixes after merge
Change 3212231 on 2016/11/28 by Richard.Wallis
Fix build errors
Change 3212253 on 2016/11/28 by Richard.Wallis
Remove MacGraphicsSwitching plugin.
#jira UE-37640
Change 3212310 on 2016/11/28 by Rolando.Caloca
DR - vk - Update glslang to 1.0.33.0
Change 3212446 on 2016/11/28 by Guillaume.Abadie
Implements PreviousFrameSwitch material expression
Change 3212594 on 2016/11/28 by Arne.Schober
DR - Fix missing include
Change 3212681 on 2016/11/29 by Rolando.Caloca
DR - vk - Auto flush for compute shader
Change 3213000 on 2016/11/29 by Gil.Gribb
temp fix for PF_MAX
Change 3213161 on 2016/11/29 by Ben.Woodhouse
Integrate latest D3D12 changes from //depot/Partners/Microsoft/UE4-DX12/...@3211714
Using:
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Runtime/D3D12RHI/...@3211714 //UE4/Dev-Rendering/Engine/Source/Runtime/D3D12RHI/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/ThirdParty/Windows/DirectX/...@3211714 //UE4/Dev-Rendering/Engine/Source/ThirdParty/Windows/DirectX/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Programs/UnrealBuildTool/...@3211714 //UE4/Dev-Rendering/Engine/Source/Programs/UnrealBuildTool/...
Changes from UE4-DX12:
*** CL 3183818 ***
Update D3D12 RHI to 4.14:
- Merged changes from Epic up until 10/20/16
- Fixed an issue where command allocators were resetting too early. I changed to aggressive command list batching by default; now that more SubmitCommandListHint calls exist in the upper engine, we don't need to worry about starving the GPU. Fewer ExecuteCommandLists calls means better performance and fewer Signals() so this change provides a GPU perf win.
I had to fix an issue with aggressive batching where we would sometimes hold on to a command list long enough (in the pending list) but hadn't executed it yet. The command allocator was being put back in the queue of allocators during ReleaseCommandAllocator() without a syncpoint set and was thus being reset too early. I added a simple counter to the command allocator so it could track how many command lists were using it. It doesn't need to be thread safe since only one thread uses a command allocator at a time.
I also added some stats around the # command lists and # command allocators since it would be possible to leak command allocators now if its pending command list count isn't decremented correctly. In that case we'd keep creating new command allocators and eventually run out of memory.
-Remove clear during allocate in the FD3D12FastConstantAllocator and FD3D12FastAllocator. The supplied resource locations are assumed to be new and thus don't need to be cleared.
-Cleanup D3D12RHI stats. There were some unused stats as well as some missing ones.
-Mark shader resource table uniform buffers as dirty only when the shader changes. Cleanup SetComputeShader calls and Dispatch calls to not set/unset the CS for each Dispatch.
-Remove unused Check SRV resolved code that Epic added to the D3D11 RHI and was brought over. We don't need it and we won't use this.
-Remove "always on" cycle counters for high frequency RHI methods like RHISetShaderTexture. These should use the engine's stat macros as they are removed on TEST + SHIPPING builds. On Xbox a significant amount of CPU time is spent in things like QueryPerformanceCounter even when STATS aren't enabled. Currently 1% of an entire capture on XBOX is spent inside this call.
I improved and cleaned up high frequency call stacks like:
- RHISetShaderTexture
- RHISetShaderResourceViewParameter
- RHISetShaderParameter
- RHISetUAVParameter
In general I moved to use templated functions, removed unused parameters, unnecessary copies, etc.
-Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Resources should be associated with the rendering thread's frame that it's currently recording command lists for and they shouldn't be cleaned up until those command lists have been translated to D3D12 command lists on the RHI thread AND completed executing on the GPU. This was confirmed to resolve an issue where CBV resources were being released too early.
This work involved a couple changes:
1) Move the "frame" fence to be incremented on the rendering thread (during RHIAdvanceFrameForGetViewportBackBuffer()) so that resources that are deleted from the rendering thread are associated with the correct frame count
2) Queue up a command from the rendering thread to signal the "frame" fence. It needs to be queued to ensure that it's signaled at the correct time on the RHI thread (after that frame's command lists have been executed).
-Disable GRHIRequiresEarlyBackBufferRenderTarget. Metal/Vulkan/Xbox11.x already do this. This is used by the Slate renderer during BeginRenderFrame and avoids a SetRenderTargets call.
-Enable GRHISupportsMSAADepthSampleAccess (used in the Editor). This was enabled for D3D11 on SM5, but not for D3D12.
-Delay load D3D12.dll and add root signature 1.1 support.
-Add explicit flush calls to improve resource barrier batching instead of implicit flushes inside FConditionalScopeResourceBarrier and FScopeResourceBarrier. Also update those classes with const members.
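The pending command-list counter described earlier in CL 3183818 can be sketched as follows. This is a simplified stand-in, not the D3D12 RHI class: `FSketchCommandAllocator`, its method names, and the fence-value comparison are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a command allocator (not the D3D12 RHI class).
class FSketchCommandAllocator
{
public:
	// Called when a command list starts recording with this allocator.
	void IncrementPendingCommandLists() { ++PendingCommandLists; }

	// Called once a command list using this allocator has been executed.
	void DecrementPendingCommandLists()
	{
		assert(PendingCommandLists > 0);
		--PendingCommandLists;
	}

	// Records the fence value the GPU must reach before a reset is safe.
	void SetSyncPoint(std::uint64_t FenceValue)
	{
		SyncPointFenceValue = FenceValue;
		bHasSyncPoint = true;
	}

	// The allocator may be reset only when no pending command list still
	// references it and the GPU has passed the recorded sync point.
	bool CanReset(std::uint64_t CompletedFenceValue) const
	{
		return PendingCommandLists == 0
			&& bHasSyncPoint
			&& CompletedFenceValue >= SyncPointFenceValue;
	}

private:
	// Plain integer: the changelist notes that only one thread uses a
	// command allocator at a time, so no atomic is needed.
	std::uint32_t PendingCommandLists = 0;
	std::uint64_t SyncPointFenceValue = 0;
	bool bHasSyncPoint = false;
};
```

If a decrement is ever missed, `CanReset` never returns true for that allocator, which mirrors the leak scenario the changelist added stats to detect.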
*** CL 3183824 ***
Fix the D3D12 RHI after integrating UE 4.14 updates:
- Fixed a bug where we would try to get the PSO of a nullptr in SetPipelineState if we needed to reset the current PSO on the cmd list.
- Fixed a spelling error
- Removed the need for bForceState, we use dirty bits now
*** CL 3183830 ***
- GetDebugFlags RHI extension, needed by XB1 movie player.
- Only query memory info if stats are enabled
- Add support for the engine's new RHISubmitCommandsAndFlushGPU function
- Update CommitPendingPipelineState to be Graphics/Compute specific and avoid the need for a IsCompute parameter.
*** CL 3183837 ***
Made PipelineState caches contain pointers to FD3D12PipelineState objects to avoid issues with using pointers to values after Find/Add calls on the maps. TMap indicates that the pointer to the value associated with a key "is only valid until the next change to any key in the map." The lifetime of the PSO pointers is managed by the low level caches (graphics and compute). Added stat for the number of Pipeline State Objects.
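The pointer-stability idea in CL 3183837 can be shown with a small standard-library sketch. It does not use TMap or FD3D12PipelineState: `FSketchPipelineState` and `FSketchPipelineStateCache` are hypothetical, and a `std::vector` of pairs stands in for a container that relocates its elements when it grows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipeline state object.
struct FSketchPipelineState
{
	std::uint64_t Hash = 0;
};

// Hypothetical cache whose storage may relocate entries as it grows.
// Each value is a heap-allocated object, so the raw pointer handed out
// stays valid even when the entry itself moves.
class FSketchPipelineStateCache
{
public:
	FSketchPipelineState* FindOrAdd(std::uint64_t Hash)
	{
		for (auto& Entry : Entries)
		{
			if (Entry.first == Hash)
			{
				return Entry.second.get();
			}
		}
		Entries.emplace_back(Hash, std::make_unique<FSketchPipelineState>());
		Entries.back().second->Hash = Hash;
		return Entries.back().second.get();
	}

	std::size_t Num() const { return Entries.size(); }

private:
	std::vector<std::pair<std::uint64_t, std::unique_ptr<FSketchPipelineState>>> Entries;
};
```

Storing the objects inline would make any previously returned pointer dangle after the container reallocates; storing owning pointers moves only the pointer, not the object it refers to.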
*** CL 3183931 ***
Update Windows D3D12 headers and libs to RS1 release bits (10.0.14393.0)
*** CL 3183978 ***
Update UBT Windows build settings:
- Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Delay load D3D12.dll and add root signature 1.1 support.
*** CL 3184132 ***
Fix Xbox PSO cache code where it could leak PSOs. Related to change 3183837.
*** Changelist 3211714 ***
Update D3D12 RHI with fixes:
- Check if we can reserve slots in GatherUniqueSamplerTables
- DirtyState more often in StateCache
- Remove InternalSetSamplerState. The alternate function isn't used.
- Allow MRTClear for arrays with holes in them
- Fix uninitialized descriptors. This was causing a GPU hang on Xbox. We need to set dirty bits for resources bound to slots outside of the current descriptor table's range
- Cleanup SetDescriptorHeap code. Move setting descriptor heap logic to the descriptor cache since it also owns things like the sampler maps. Added members to the descriptor cache to track the last heaps that were set on the command list to avoid dirtying bits unnecessarily.
- Resource transitions: go through Common between queues (3D <--> Compute)
- Fix initial state for placed resources.
- Merging epic
Change 3213250 on 2016/11/29 by Chris.Bunner
GBufferHints tooltip fix.
#jira UE-39103
Change 3213345 on 2016/11/29 by Gil.Gribb
more IWYU fallout
Change 3213676 on 2016/11/29 by Rolando.Caloca
DR - Fix incorrect texture getting cleared
Change 3213728 on 2016/11/29 by Rolando.Caloca
DR - Lambda-ize
Change 3214461 on 2016/11/29 by Ben.Woodhouse
Rollout August QFE4 XDK (required for latest DX12 changes on XB1)
Change 3215317 on 2016/11/30 by Daniel.Wright
PS4 compile fix
Change 3216343 on 2016/11/30 by Arne.Schober
DR - UE-39155 - after talking to Brian it occurred to us that flipping the world space normal is nonsensical. And indeed the Grass was using world space normals.
Change 3216844 on 2016/12/01 by Ben.Woodhouse
Fix for static analysis warnings after discussion with Microsoft
Change 3216916 on 2016/12/01 by Gil.Gribb
Merging //UE4/Dev-Main@3216539 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3217385 on 2016/12/01 by Arne.Schober
DR - UE-39218, UE-39221, UE-39224 and potentially UE-39214 - The Stencil bits for Light channels and decal application were not set in the dynamic basepass
Change 3217464 on 2016/12/01 by Ben.Woodhouse
Fix for reflection capture resize assert. The assert is only valid in cooked builds, so disable it in editor
#jira UE-39225
Change 3217534 on 2016/12/01 by Arne.Schober
DR - Fix Merge conflict
Change 3217581 on 2016/12/01 by Rolando.Caloca
DR - Fix assert on debug
Change 3217741 on 2016/12/01 by Benjamin.Hyder
Duplicate audio fix.
Change 3217890 on 2016/12/01 by Rolando.Caloca
DR - Fix widget not rendering properly when hidden
#jira UE-39221
Change 3218129 on 2016/12/01 by Arne.Schober
DR - UE-39214 - LOD dither value was accidentally cached across the static draw list.
Change 3218759 on 2016/12/02 by Guillaume.Abadie
Fixes editor compositing bug caused by alpha through post processing change 3209903
#jira UE-39221
[CL 3219854 by Marcus Wassmer in Main branch]
2016-12-02 16:43:04 -05:00
TArray<FBox> PendingStreamingHeightfieldBoxes;
for (const FPrimitiveSceneInfo* HeightfieldPrimitive : Scene->DistanceFieldSceneData.HeightfieldPrimitives)
{
	if (HeightfieldPrimitive->Proxy->HeightfieldHasPendingStreaming())
	{
		PendingStreamingHeightfieldBoxes.Add(HeightfieldPrimitive->Proxy->GetBounds().GetBox());
	}
}
2015-05-11 20:04:15 -04:00
if (View.ViewState)
{
	View.ViewState->GlobalDistanceFieldUpdateIndex++;
2020-07-06 18:58:26 -04:00
	if (View.ViewState->GlobalDistanceFieldUpdateIndex > 128)
2015-05-11 20:04:15 -04:00
	{
		View.ViewState->GlobalDistanceFieldUpdateIndex = 0;
	}
2020-07-06 18:58:26 -04:00
int32 NumClipmapUpdateRequests = 0;
2020-11-19 05:23:44 -04:00
FViewElementPDI ViewPDI(&View, nullptr, &View.DynamicPrimitiveCollector);
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
bool bSharedDataReallocated = false;
GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer = nullptr;
GlobalDistanceFieldInfo.PageFreeListBuffer = nullptr;
GlobalDistanceFieldInfo.PageAtlasTexture = nullptr;
2022-03-01 21:07:45 -05:00
GlobalDistanceFieldInfo.CoverageAtlasTexture = nullptr;
2020-09-08 17:44:06 -04:00
if (View.ViewState)
{
	FSceneViewState& ViewState = *View.ViewState;
2022-01-26 17:07:27 -05:00
	const int32 MaxPageNum = GlobalDistanceField::GetMaxPageNum(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
	const FIntVector PageAtlasTextureSize = GlobalDistanceField::GetPageAtlasSize(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
2020-09-08 17:44:06 -04:00
	if (!ViewState.GlobalDistanceFieldPageFreeListAllocatorBuffer)
	{
2022-04-25 13:00:12 -04:00
		ViewState.GlobalDistanceFieldPageFreeListAllocatorBuffer = AllocatePooledBuffer(FRDGBufferDesc::CreateStructuredDesc(sizeof(uint32), 1), TEXT("PageFreeListAllocator"));
2020-09-08 17:44:06 -04:00
	}
	if (!ViewState.GlobalDistanceFieldPageFreeListBuffer
		|| ViewState.GlobalDistanceFieldPageFreeListBuffer->Desc.NumElements != MaxPageNum)
	{
2022-04-25 13:00:12 -04:00
		ViewState.GlobalDistanceFieldPageFreeListBuffer = AllocatePooledBuffer(FRDGBufferDesc::CreateStructuredDesc(sizeof(uint32), MaxPageNum), TEXT("PageFreeList"));
2020-09-08 17:44:06 -04:00
	}
	if (!ViewState.GlobalDistanceFieldPageAtlasTexture
		|| ViewState.GlobalDistanceFieldPageAtlasTexture->GetDesc().Extent.X != PageAtlasTextureSize.X
		|| ViewState.GlobalDistanceFieldPageAtlasTexture->GetDesc().Extent.Y != PageAtlasTextureSize.Y
		|| ViewState.GlobalDistanceFieldPageAtlasTexture->GetDesc().Depth != PageAtlasTextureSize.Z)
	{
		FPooledRenderTargetDesc VolumeDesc = FPooledRenderTargetDesc(FPooledRenderTargetDesc::CreateVolumeDesc(
			PageAtlasTextureSize.X,
			PageAtlasTextureSize.Y,
			PageAtlasTextureSize.Z,
			PF_R8,
			FClearValueBinding::None,
2020-09-24 00:43:27 -04:00
			TexCreate_None,
2020-09-08 17:44:06 -04:00
			// TexCreate_ReduceMemoryWithTilingMode used because 128^3 texture comes out 4x bigger on PS4 with recommended volume texture tiling modes
2022-02-15 23:38:23 -05:00
			TexCreate_ShaderResource | TexCreate_UAV | TexCreate_ReduceMemoryWithTilingMode | TexCreate_3DTiling,
2020-09-08 17:44:06 -04:00
			false));
		GRenderTargetPool.FindFreeElement(
			RHICmdList,
			VolumeDesc,
			ViewState.GlobalDistanceFieldPageAtlasTexture,
2022-01-13 17:56:22 -05:00
			TEXT("GlobalDistanceFieldPageAtlas")
2020-09-08 17:44:06 -04:00
		);
		bSharedDataReallocated = true;
	}
2022-03-01 21:07:45 -05:00
    const FIntVector CoverageAtlasTextureSize = GlobalDistanceField::GetCoverageAtlasSize(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
    if (bLumenEnabled
        && (!ViewState.GlobalDistanceFieldCoverageAtlasTexture
            || ViewState.GlobalDistanceFieldCoverageAtlasTexture->GetDesc().Extent.X != CoverageAtlasTextureSize.X
            || ViewState.GlobalDistanceFieldCoverageAtlasTexture->GetDesc().Extent.Y != CoverageAtlasTextureSize.Y
            || ViewState.GlobalDistanceFieldCoverageAtlasTexture->GetDesc().Depth != CoverageAtlasTextureSize.Z))
    {
        FPooledRenderTargetDesc VolumeDesc = FPooledRenderTargetDesc(FPooledRenderTargetDesc::CreateVolumeDesc(
            CoverageAtlasTextureSize.X,
            CoverageAtlasTextureSize.Y,
            CoverageAtlasTextureSize.Z,
            PF_R8,
            FClearValueBinding::None,
            TexCreate_None,
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_ReduceMemoryWithTilingMode | TexCreate_3DTiling,
            false));
        GRenderTargetPool.FindFreeElement(
            RHICmdList,
            VolumeDesc,
            ViewState.GlobalDistanceFieldCoverageAtlasTexture,
            TEXT("GlobalDistanceFieldCoverageAtlas")
        );
        bSharedDataReallocated = true;
    }
2020-09-08 17:44:06 -04:00
    GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer = ViewState.GlobalDistanceFieldPageFreeListAllocatorBuffer;
    GlobalDistanceFieldInfo.PageFreeListBuffer = ViewState.GlobalDistanceFieldPageFreeListBuffer;
    GlobalDistanceFieldInfo.PageAtlasTexture = ViewState.GlobalDistanceFieldPageAtlasTexture;
2022-03-01 21:07:45 -05:00
    GlobalDistanceFieldInfo.CoverageAtlasTexture = ViewState.GlobalDistanceFieldCoverageAtlasTexture;
2020-09-08 17:44:06 -04:00
}
2022-02-08 16:53:35 -05:00
if (GAOGlobalDistanceFieldCacheMostlyStaticSeparately)
2020-09-08 17:44:06 -04:00
{
2022-01-26 17:07:27 -05:00
    const FIntVector PageTableTextureResolution = GlobalDistanceField::GetPageTableTextureResolution(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
2020-09-08 17:44:06 -04:00
    TRefCountPtr<IPooledRenderTarget>& PageTableTexture = View.ViewState->GlobalDistanceFieldPageTableCombinedTexture;
    if (!PageTableTexture
        || PageTableTexture->GetDesc().Extent.X != PageTableTextureResolution.X
        || PageTableTexture->GetDesc().Extent.Y != PageTableTextureResolution.Y
        || PageTableTexture->GetDesc().Depth != PageTableTextureResolution.Z)
    {
        FPooledRenderTargetDesc VolumeDesc = FPooledRenderTargetDesc(FPooledRenderTargetDesc::CreateVolumeDesc(
            PageTableTextureResolution.X,
            PageTableTextureResolution.Y,
            PageTableTextureResolution.Z,
2022-04-22 19:55:41 -04:00
            PF_R32_UINT,
2020-09-08 17:44:06 -04:00
            FClearValueBinding::None,
2020-09-24 00:43:27 -04:00
            TexCreate_None,
2020-09-08 17:44:06 -04:00
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_ReduceMemoryWithTilingMode | TexCreate_3DTiling,
            false));
        GRenderTargetPool.FindFreeElement(
            RHICmdList,
            VolumeDesc,
            PageTableTexture,
2022-01-13 17:56:22 -05:00
            TEXT("DistanceFieldPageTableCombined")
2020-09-08 17:44:06 -04:00
        );
        bSharedDataReallocated = true;
    }
    GlobalDistanceFieldInfo.PageTableCombinedTexture = PageTableTexture;
}
2020-09-15 11:03:59 -04:00
{
2021-02-04 15:30:42 -04:00
    const int32 ClipmapMipResolution = GlobalDistanceField::GetClipmapMipResolution(bLumenEnabled);
2022-01-26 17:07:27 -05:00
    const FIntVector MipTextureResolution = FIntVector(ClipmapMipResolution, ClipmapMipResolution, ClipmapMipResolution * GetNumGlobalDistanceFieldClipmaps(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance));
2020-09-15 11:03:59 -04:00
    TRefCountPtr<IPooledRenderTarget>& MipTexture = View.ViewState->GlobalDistanceFieldMipTexture;
    if (!MipTexture
        || MipTexture->GetDesc().Extent.X != MipTextureResolution.X
        || MipTexture->GetDesc().Extent.Y != MipTextureResolution.Y
        || MipTexture->GetDesc().Depth != MipTextureResolution.Z)
    {
        FPooledRenderTargetDesc VolumeDesc = FPooledRenderTargetDesc(FPooledRenderTargetDesc::CreateVolumeDesc(
            MipTextureResolution.X,
            MipTextureResolution.Y,
            MipTextureResolution.Z,
            PF_R8,
            FClearValueBinding::None,
2020-09-24 00:43:27 -04:00
            TexCreate_None,
2020-09-15 11:03:59 -04:00
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_ReduceMemoryWithTilingMode | TexCreate_3DTiling,
            false));
        GRenderTargetPool.FindFreeElement(
            RHICmdList,
            VolumeDesc,
            MipTexture,
2022-01-13 17:56:22 -05:00
            TEXT("GlobalSDFMipTexture")
2020-09-15 11:03:59 -04:00
        );
        bSharedDataReallocated = true;
    }
    GlobalDistanceFieldInfo.MipTexture = MipTexture;
}
2020-09-08 17:44:06 -04:00
for (uint32 CacheType = 0; CacheType < GDF_Num; CacheType++)
{
2022-01-26 17:07:27 -05:00
    const FIntVector PageTableTextureResolution = GlobalDistanceField::GetPageTableTextureResolution(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
2020-09-08 17:44:06 -04:00
    TRefCountPtr<IPooledRenderTarget>& PageTableTexture = View.ViewState->GlobalDistanceFieldPageTableLayerTextures[CacheType];
    if (CacheType == GDF_Full || GAOGlobalDistanceFieldCacheMostlyStaticSeparately)
    {
        if (!PageTableTexture
            || PageTableTexture->GetDesc().Extent.X != PageTableTextureResolution.X
            || PageTableTexture->GetDesc().Extent.Y != PageTableTextureResolution.Y
            || PageTableTexture->GetDesc().Depth != PageTableTextureResolution.Z)
        {
            FPooledRenderTargetDesc VolumeDesc = FPooledRenderTargetDesc(FPooledRenderTargetDesc::CreateVolumeDesc(
                PageTableTextureResolution.X,
                PageTableTextureResolution.Y,
                PageTableTextureResolution.Z,
2022-04-22 19:55:41 -04:00
                PF_R32_UINT,
2020-09-08 17:44:06 -04:00
                FClearValueBinding::None,
2020-09-24 00:43:27 -04:00
                TexCreate_None,
2020-09-08 17:44:06 -04:00
                TexCreate_ShaderResource | TexCreate_UAV | TexCreate_ReduceMemoryWithTilingMode | TexCreate_3DTiling,
                false));
            GRenderTargetPool.FindFreeElement(
                RHICmdList,
                VolumeDesc,
                PageTableTexture,
2022-01-13 17:56:22 -05:00
                CacheType == GDF_MostlyStatic ? TEXT("GlobalDistanceFieldPageTableStationaryLayer") : TEXT("GlobalDistanceFieldPageTableMovableLayer")
2020-09-08 17:44:06 -04:00
            );
            bSharedDataReallocated = true;
        }
    }
    GlobalDistanceFieldInfo.PageTableLayerTextures[CacheType] = PageTableTexture;
}
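The blocks above repeat one pattern: reallocate a volume texture when it is missing or its dimensions no longer match, then flag that shared data was reallocated. The following is a minimal standalone sketch of that pattern, not engine code. `Int3`, `VolumeTexture`, `NeedsReallocation` and `EnsureVolumeTexture` are hypothetical stand-ins for `FIntVector`, the pooled render target, and the inline checks in the listing.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for FIntVector.
struct Int3 { int X, Y, Z; };

// Hypothetical stand-in for a pooled volume render target; only its size matters here.
struct VolumeTexture { Int3 Size; };

// True when the texture is missing or any dimension differs from the desired size.
inline bool NeedsReallocation(const std::shared_ptr<VolumeTexture>& Texture, const Int3& Desired)
{
    return !Texture
        || Texture->Size.X != Desired.X
        || Texture->Size.Y != Desired.Y
        || Texture->Size.Z != Desired.Z;
}

// (Re)creates the texture if needed. Returns true when a new texture was created,
// so a caller can OR the result into a flag such as bSharedDataReallocated.
inline bool EnsureVolumeTexture(std::shared_ptr<VolumeTexture>& Texture, const Int3& Desired)
{
    assert(Desired.X > 0 && Desired.Y > 0 && Desired.Z > 0);
    if (!NeedsReallocation(Texture, Desired))
    {
        return false;
    }
    Texture = std::make_shared<VolumeTexture>(VolumeTexture{Desired});
    return true;
}
```

A caller would write something like `bReallocated |= EnsureVolumeTexture(Texture, Size);` once per texture. The real code also varies pixel format and debug name per texture, which this sketch leaves out.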
2015-05-11 20:04:15 -04:00
for (int32 ClipmapIndex = 0; ClipmapIndex < NumClipmaps; ClipmapIndex++)
{
    FGlobalDistanceFieldClipmapState& ClipmapViewState = View.ViewState->GlobalDistanceFieldClipmapState[ClipmapIndex];
2021-02-04 15:30:42 -04:00
    const int32 ClipmapResolution = GlobalDistanceField::GetClipmapResolution(bLumenEnabled);
    const float ClipmapExtent = GlobalDistanceField::GetClipmapExtent(ClipmapIndex, Scene, bLumenEnabled);
2020-09-08 17:44:06 -04:00
    const float ClipmapVoxelSize = (2.0f * ClipmapExtent) / ClipmapResolution;
    const float ClipmapPageSize = GGlobalDistanceFieldPageResolution * ClipmapVoxelSize;
2020-09-15 11:03:59 -04:00
    const float ClipmapInfluenceRadius = GGlobalDistanceFieldInfluenceRangeInVoxels * ClipmapVoxelSize;
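The three derived quantities above follow from the clipmap spanning `[-ClipmapExtent, +ClipmapExtent]` on each axis, so its world-space width is twice the extent. The sketch below reproduces that arithmetic in isolation. `ClipmapMetrics` and `ComputeClipmapMetrics` are hypothetical names, and the constants `PageResolution = 8` and `InfluenceRangeInVoxels = 4` are assumed values standing in for `GGlobalDistanceFieldPageResolution` and `GGlobalDistanceFieldInfluenceRangeInVoxels`, whose actual values are not shown in this listing.

```cpp
#include <cassert>

// Assumed values; the listing does not show the engine's actual settings.
constexpr int PageResolution = 8;                // voxels along one side of a page
constexpr float InfluenceRangeInVoxels = 4.0f;   // influence range measured in voxels

// Hypothetical bundle of the three per-clipmap quantities.
struct ClipmapMetrics
{
    float VoxelSize;        // world units per voxel
    float PageSize;         // world units per page side
    float InfluenceRadius;  // world units
};

// Same arithmetic as the listing: width = 2 * extent, divided by the voxel count.
inline ClipmapMetrics ComputeClipmapMetrics(float ClipmapExtent, int ClipmapResolution)
{
    assert(ClipmapResolution > 0);
    const float VoxelSize = (2.0f * ClipmapExtent) / static_cast<float>(ClipmapResolution);
    return ClipmapMetrics{ VoxelSize, PageResolution * VoxelSize, InfluenceRangeInVoxels * VoxelSize };
}
```

For example, with an extent of 2500 units and a resolution of 128, the voxel size is 5000 / 128 = 39.0625 units, so under the assumed constants a page spans 312.5 units and the influence radius is 156.25 units.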
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2015-05-11 20:43:54 -04:00
// Accumulate primitive modifications in the viewstate in case we don't update the clipmap this frame
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
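The fence reuse bug described above can be illustrated with a small simulation. This is only a sketch and does not use the D3D12 API: `SimFence`, `EnqueueSignal` and `GpuReach` are hypothetical names, and the one assumption is that a fence value counts as complete once the completed value is greater than or equal to it. Under that rule, waiting for 0 passes immediately, and values that only increase never need a reset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of a GPU fence (not the D3D12 API).
struct SimFence
{
    uint64_t Completed = 0;   // last value the simulated GPU has reached
    uint64_t NextSignal = 1;  // next value to signal; only ever increases

    // Reserve a new, strictly larger value for the caller to wait on.
    uint64_t EnqueueSignal()
    {
        assert(NextSignal != UINT64_MAX); // keep the sentinel value unused
        return NextSignal++;
    }

    // Simulate the GPU reaching a previously enqueued signal.
    void GpuReach(uint64_t Value)
    {
        if (Value > Completed)
        {
            Completed = Value;
        }
    }

    // A value is complete once the GPU has reached it or gone past it.
    bool IsComplete(uint64_t Value) const
    {
        return Completed >= Value;
    }
};
```

Because every wait targets a fresh value, a wait can only succeed after the matching signal has been reached.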
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
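A non-contiguous page allocator of the kind described above can be sketched as a free list of physical pages plus a per-allocation page table. This is only an illustration, not the engine's ESRAM allocator: `PagePool` and its methods are hypothetical, and the actual mapping of pages onto a contiguous virtual address range is platform-specific and omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pool of fixed-size physical pages.
class PagePool
{
public:
    explicit PagePool(uint32_t NumPages)
    {
        // Push in reverse so the lowest page index is handed out first.
        for (uint32_t Index = NumPages; Index > 0; --Index)
        {
            FreePages.push_back(Index - 1);
        }
    }

    // Returns a page table: entry i is the physical page backing virtual
    // page i of the allocation. Empty if not enough pages are free.
    std::vector<uint32_t> Allocate(std::size_t Count)
    {
        if (Count > FreePages.size())
        {
            return {};
        }
        std::vector<uint32_t> Table;
        for (std::size_t Index = 0; Index < Count; ++Index)
        {
            Table.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return Table;
    }

    // Return an allocation's pages to the pool so they can be reused.
    void Free(const std::vector<uint32_t>& Table)
    {
        for (uint32_t Page : Table)
        {
            FreePages.push_back(Page);
        }
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<uint32_t> FreePages;
};
```

After a few allocations and frees, a two-page request may be backed by physical pages 1 and 3 while still appearing as one contiguous virtual range to the caller.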
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
for (uint32 CacheType = 0; CacheType < GDF_Num; CacheType++)
{
2022-02-08 16:53:35 -05:00
const uint32 DestCacheType = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? CacheType : GDF_Full;
ClipmapViewState.Cache[DestCacheType].PrimitiveModifiedBounds.Append(Scene->DistanceFieldSceneData.PrimitiveModifiedBounds[CacheType]);
2017-01-26 19:20:49 -05:00
}
2015-05-11 20:43:54 -04:00
2020-09-08 17:44:06 -04:00
const bool bForceFullUpdate = bSharedDataReallocated
Copying //UE4/Dev-Framework to Dev-Main (//UE4/Dev-Main) @ 2944217
#lockdown Nick.Penwarden
==========================
MAJOR FEATURES + CHANGES
==========================
Change 2899855 on 2016/03/08 by Marc.Audy
Merging //UE4/Dev-Main to Dev-Framework (//UE4/Dev-Framework) @ 2899785
Change 2926689 on 2016/03/29 by Jeff.Farris
AAIController::SetFocus() will now implicitly clear any location focus at the same priority.
UE-27975
#rb john.abercrombie
Change 2926690 on 2016/03/29 by Jeff.Farris
Using wildcard operator with the "KismetEvent" or "ke" console commands will now only trigger the event on objects in the world in which it was triggered. Prevents badness with running events on things like CDOs and editor actors. (UE-23106)
Change 2926691 on 2016/03/29 by mason.seay
Content for testing collision on scaled components
Change 2926692 on 2016/03/29 by Jeff.Farris
- FixupDeltaSeconds now considers time dilation when clamping.
- Acceptable range for time dilation values is now a config parameter on WorldSettings
- Acceptable range for undilated frame times is now a config parameter on WorldSettings
(UE-27815)
#rb marc.audy
Change 2926711 on 2016/03/29 by Ori.Cohen
Fix constraint rendering when scaling a constraint actor
#JIRA UE-28691, UE-28700
#rb Lina.Halper
Change 2926745 on 2016/03/29 by Lukasz.Furman
navigation filters can now be instantiated per querier - usually AI agent
required for FORT-21372
Change 2926789 on 2016/03/29 by Ori.Cohen
Downgrade check to ensure for 2d physics during a hard shutdown
#rb Michael.Noland
Change 2926859 on 2016/03/29 by Ori.Cohen
Fix red herring warnings of not locking physx scenes during hard shutdown.
#JIRA UE-28747
#rb Michael.Noland
Change 2927444 on 2016/03/30 by Thomas.Sarkanen
Fixed Blueprint compiler errors when resetting timer handles
Added basic support for 64-bit int/uint terms to Blueprint. This allows the use of opaque 64-bit integer types inside of BlueprintType structs, it in no way means that 64-bit ints are fully supported in Blueprint.
Corrected a left-over formatting oversight when converting a FTimerHandle to a string.
Added new by-ref "Clear and Invalidate Timer by Handle" function to Blueprint system library & deprecated old version.
#rb Maciej.Mroz (and a few others!)
#jira UE-28833 - Unresolved compiler error for B_Pickups blueprint in Fortnite
Change 2927520 on 2016/03/30 by Jurre.deBaare
Should not allow skeletal mesh components mobility to be set to static, but detach instead
#fix Added CanHaveStaticMobility to SceneComponent class, and check this when trying to propagate Static mobility to parent component
#jira UE-26364
Change 2927533 on 2016/03/30 by Jurre.deBaare
Static Mesh Merge tool: when merging from multiple blueprints, fails to combine same materials
#fix Material index remapping was part of if-clause where it shouldn't be
#jira UE-23827
Static Mesh Merge tool, failed to combine physics data if using complex
#fix Required copying the SectionInfoMap from source static meshes
HLOD/MergeActor - Vertex Colours are not correctly propagated to negatively scaled meshes
#fix had to re-order function calls
#jira UE-28316
#rb James.Golding
Change 2927535 on 2016/03/30 by Ori.Cohen
Make sub-stepping run on game thread
#JIRA UE-24011
#rb Gil.Gribb
Change 2927537 on 2016/03/30 by Jurre.deBaare
Warning message when HLOD mesh > 65536 vertices
#jira UE-22365
#fix added messages when building proxy mesh
Change 2927691 on 2016/03/30 by Jeff.Farris
Fixed potential PlayerState leak (UE-22700)
Change 2927692 on 2016/03/30 by Lina.Halper
Allow it to select any name they want other than just restrict to what we have.
- I think it may not be the best solution but with current widget built, you can't even clear name, which is problem.
- Other solution is to add "Clear" as a name, and when that gets entered, we just clear it, but then the X button is odd and no purpose being there.
- I think we should just allow them to choose if they don't like it but with suggestions.
#rb: Ori.Cohen
#jira UE-27786
#code review: Benn.Gallagher
Change 2927853 on 2016/03/30 by Lina.Halper
[CL 2944273 by Marc Audy in Main branch]
2016-04-14 16:25:11 -04:00
|| !View.ViewState->bInitializedGlobalDistanceFieldOrigins
2016-04-04 18:44:59 -04:00
// Detect when max occlusion distance has changed
2020-09-08 17:44:06 -04:00
|| ClipmapViewState.CachedClipmapExtent != ClipmapExtent
2018-09-11 14:44:10 -04:00
|| ClipmapViewState.CacheMostlyStaticSeparately != GAOGlobalDistanceFieldCacheMostlyStaticSeparately
2019-01-10 04:23:30 -05:00
|| ClipmapViewState.LastUsedSceneDataForFullUpdate != &Scene->DistanceFieldSceneData
2020-02-06 17:56:50 -05:00
|| GAOGlobalDistanceFieldForceFullUpdate
|| GDFReadbackRequest != nullptr;
2016-04-04 18:44:59 -04:00
2022-02-10 09:28:01 -05:00
const bool bUpdateRequested = GAOUpdateGlobalDistanceField != 0 && ShouldUpdateClipmapThisFrame(ClipmapIndex, NumClipmaps, View.ViewState->GlobalDistanceFieldUpdateIndex);
2020-07-06 18:58:26 -04:00
if (bUpdateRequested)
2015-05-11 20:04:15 -04:00
{
2020-07-06 18:58:26 -04:00
NumClipmapUpdateRequests++;
}
if (bUpdateRequested || bForceFullUpdate)
{
2021-02-04 15:30:42 -04:00
const FVector GlobalDistanceFieldViewOrigin = GetGlobalDistanceFieldViewOrigin(View, ClipmapIndex, bLumenEnabled);
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
// Snap to the global distance field page's size
FIntVector PageGridCenter;
PageGridCenter.X = FMath::RoundToInt(GlobalDistanceFieldViewOrigin.X / ClipmapPageSize);
PageGridCenter.Y = FMath::RoundToInt(GlobalDistanceFieldViewOrigin.Y / ClipmapPageSize);
PageGridCenter.Z = FMath::RoundToInt(GlobalDistanceFieldViewOrigin.Z / ClipmapPageSize);
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
const FVector SnappedCenter = FVector(PageGridCenter) * ClipmapPageSize;
const FBox ClipmapBounds(SnappedCenter - ClipmapExtent, SnappedCenter + ClipmapExtent);
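The page-grid snapping above can be reproduced outside the engine with plain structs. This is a minimal sketch, not engine code: `Vec3`, `IntVec3`, `Box3`, `SnapToPageGrid` and `MakeClipmapBounds` are hypothetical stand-ins, the extent is assumed to be a scalar, and `std::lround` stands in for `FMath::RoundToInt` (the two may differ on exact .5 ties).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double X, Y, Z; };
struct IntVec3 { long X, Y, Z; };
struct Box3 { Vec3 Min, Max; };

// Round each component of Origin / PageSize to the nearest page index.
inline IntVec3 SnapToPageGrid(const Vec3& Origin, double PageSize)
{
    assert(PageSize > 0.0);
    return { std::lround(Origin.X / PageSize),
             std::lround(Origin.Y / PageSize),
             std::lround(Origin.Z / PageSize) };
}

// Build an axis-aligned box of half-size Extent around the snapped center.
inline Box3 MakeClipmapBounds(const Vec3& Origin, double PageSize, double Extent)
{
    const IntVec3 Grid = SnapToPageGrid(Origin, PageSize);
    const Vec3 Center{ Grid.X * PageSize, Grid.Y * PageSize, Grid.Z * PageSize };
    return { { Center.X - Extent, Center.Y - Extent, Center.Z - Extent },
             { Center.X + Extent, Center.Y + Extent, Center.Z + Extent } };
}
```

Snapping makes the bounds move in whole-page steps, so small camera movements leave the clipmap bounds unchanged.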
2015-05-11 20:04:15 -04:00
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
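One general source of waste in a buddy allocator is internal fragmentation from rounding each request up to a power-of-two block. The sketch below shows only that effect: `RoundUpToPowerOfTwo` and `BuddyWaste` are hypothetical helpers, the minimum block size is an assumed parameter, and it makes no attempt to reproduce the figures quoted in the change description.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= Size.
inline uint64_t RoundUpToPowerOfTwo(uint64_t Size)
{
    assert(Size > 0 && Size <= (uint64_t{1} << 63));
    uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost when a request of Size bytes is served from a buddy block,
// assuming blocks are powers of two and never smaller than MinBlock.
inline uint64_t BuddyWaste(uint64_t Size, uint64_t MinBlock)
{
    const uint64_t Rounded = RoundUpToPowerOfTwo(Size);
    const uint64_t Block = Rounded < MinBlock ? MinBlock : Rounded;
    return Block - Size;
}
```

For example, a 5KB request served from an 8KB block leaves 3KB unused, while a request of exactly 4KB wastes nothing.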
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
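The fence reuse bug in the list above can be illustrated with a minimal sketch. It assumes a fence is a monotonically increasing counter and that a wait is satisfied once the completed value reaches the requested value. FFakeFence and NextFenceValue are hypothetical names for illustration only, not the D3D12RHI types.

```cpp
#include <cstdint>

// Hypothetical stand-in for a GPU fence: a counter that only moves forward.
struct FFakeFence
{
    uint64_t CompletedValue = 0;

    // Records that all work up to Value has finished.
    void Signal(uint64_t Value) { CompletedValue = Value; }

    // A wait on Value is satisfied once the completed value has reached it.
    bool IsComplete(uint64_t Value) const { return CompletedValue >= Value; }
};

// Hypothetical helper: the value to wait on after signaling Current.
// Unsigned arithmetic wraps, so signaling UINT64_MAX makes the next value 0.
inline uint64_t NextFenceValue(uint64_t Current)
{
    return Current + 1;
}
```

Under these assumptions, a wait on 0 is satisfied before any work has completed, which matches the "waiting for 0, which was always signaled" symptom. Starting the counter at 1 and never resetting it avoids the wrap in practice.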
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize setting to the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
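The allocator idea described in the ESRAM entry above can be sketched as a free list of physical pages plus a per-allocation table that maps a contiguous run of virtual page indices onto whichever physical pages happen to be free. FPagePool and its methods are hypothetical names, and the sketch ignores the deferred deallocation issue mentioned in the note. It is not the engine's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page pool; page indices stand in for physical ESRAM pages.
class FPagePool
{
public:
    explicit FPagePool(uint32_t NumPages)
    {
        // Push in reverse so that page 0 is handed out first.
        for (uint32_t Page = NumPages; Page > 0; --Page)
        {
            FreePages.push_back(Page - 1);
        }
    }

    // Returns a table where entry i is the physical page backing virtual page i.
    // The physical pages need not be adjacent. Returns an empty table on failure.
    std::vector<uint32_t> Allocate(uint32_t Count)
    {
        std::vector<uint32_t> Mapping;
        if (Count > FreePages.size())
        {
            return Mapping;
        }
        for (uint32_t Index = 0; Index < Count; ++Index)
        {
            Mapping.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return Mapping;
    }

    // Returns the pages of a destroyed resource to the pool for reuse.
    void Free(const std::vector<uint32_t>& Mapping)
    {
        for (uint32_t Page : Mapping)
        {
            FreePages.push_back(Page);
        }
    }

    size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<uint32_t> FreePages;
};
```

Because freed pages go back onto the free list, a later allocation can be satisfied from scattered pages that a contiguous-only allocator would have to reject.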
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
const bool bUsePartialUpdates = GAOGlobalDistanceFieldPartialUpdates && !bForceFullUpdate;
2015-05-11 20:04:15 -04:00
if (!bUsePartialUpdates)
{
// Store the location of the full update
2020-09-08 17:44:06 -04:00
ClipmapViewState.FullUpdateOriginInPages = PageGridCenter;
Copying //UE4/Dev-Framework to Dev-Main (//UE4/Dev-Main) @ 2944217
#lockdown Nick.Penwarden
==========================
MAJOR FEATURES + CHANGES
==========================
Change 2899855 on 2016/03/08 by Marc.Audy
Merging //UE4/Dev-Main to Dev-Framework (//UE4/Dev-Framework) @ 2899785
Change 2926689 on 2016/03/29 by Jeff.Farris
AAIController::SetFocus() will now implicitly clear any location focus at the same priority.
UE-27975
#rb john.abercrombie
Change 2926690 on 2016/03/29 by Jeff.Farris
Using wildcard operator with the "KismetEvent" or "ke" console commands will now only trigger the event on objects in the world in which it was triggered. Prevents badness with running events on things like CDOs and editor actors. (UE-23106)
Change 2926691 on 2016/03/29 by mason.seay
Content for testing collision on scaled components
Change 2926692 on 2016/03/29 by Jeff.Farris
- FixupDeltaSeconds now considers time dilation when clamping.
- Acceptable range for time dilation values is now a config parameter on WorldSettings
- Acceptable range for undilated frame times is now a config parameter on WorldSettings
(UE-27815)
#rb marc.audy
Change 2926711 on 2016/03/29 by Ori.Cohen
Fix constraint rendering when scaling a constraint actor
#JIRA UE-28691, UE-28700
#rb Lina.Halper
Change 2926745 on 2016/03/29 by Lukasz.Furman
navigation filters can now be instantiated per querier - usually AI agent
required for FORT-21372
Change 2926789 on 2016/03/29 by Ori.Cohen
Downgrade check to ensure for 2d physics during a hard shutdown
#rb Michael.Noland
Change 2926859 on 2016/03/29 by Ori.Cohen
Fix red herring warnings of not locking physx scenes during hard shutdown.
#JIRA UE-28747
#rb Michael.Noland
Change 2927444 on 2016/03/30 by Thomas.Sarkanen
Fixed Blueprint compiler errors when resetting timer handles
Added basic support for 64-bit int/uint terms to Blueprint. This allows the use of opaque 64-bit integer types inside of BlueprintType structs, it in no way means that 64-bit ints are fully supported in Blueprint.
Corrected a left-over formatting oversight when converting a FTimerHandle to a string.
Added new by-ref "Clear and Invalidate Timer by Handle" function to Blueprint system library & deprecated old version.
#rb Maciej.Mroz (and a few others!)
#jira UE-28833 - Unresolved compiler error for B_Pickups blueprint in Fortnite
Change 2927520 on 2016/03/30 by Jurre.deBaare
Should not allow skeletal mesh components mobility to be set to static, but detach instead
#fix Added CanHaveStaticMobility to SceneComponent class, and check this when trying to propagate Static mobility to parent component
#jira UE-26364
Change 2927533 on 2016/03/30 by Jurre.deBaare
Static Mesh Merge tool: when merging from multiple blueprints, fails to combine same materials
#fix Material index remapping was part of if-clause where it shouldn't be
#jira UE-23827
Static Mesh Merge tool, failed to combine physics data if using complex
#fix Required copying the SectionInfoMap from source static meshes
HLOD/MergeActor - Vertex Colours are not correctly propagated to negatively scaled meshes
#fix had to re-order function calls
#jira UE-28316
#rb James.Golding
Change 2927535 on 2016/03/30 by Ori.Cohen
Make sub-stepping run on game thread
#JIRA UE-24011
#rb Gil.Gribb
Change 2927537 on 2016/03/30 by Jurre.deBaare
Warning message when HLOD mesh > 65536 vertices
#jira UE-22365
#fix added messages when building proxy mesh
Change 2927691 on 2016/03/30 by Jeff.Farris
Fixed potential PlayerState leak (UE-22700)
Change 2927692 on 2016/03/30 by Lina.Halper
Allow users to select any name they want, rather than restricting them to the names we already have.
- This may not be the best solution, but with the current widget you can't even clear the name, which is a problem.
- Another solution is to add "Clear" as a name and clear the field when it is entered, but then the X button is odd and serves no purpose.
- I think we should just let them choose freely while still offering suggestions.
#rb: Ori.Cohen
#jira UE-27786
#code review: Benn.Gallagher
Change 2927853 on 2016/03/30 by Lina.Halper
[CL 2944273 by Marc Audy in Main branch]
2016-04-14 16:25:11 -04:00
View.ViewState->bInitializedGlobalDistanceFieldOrigins = true;
2020-09-08 17:44:06 -04:00
View.ViewState->bGlobalDistanceFieldPendingReset = true;
2019-01-10 04:23:30 -05:00
ClipmapViewState.LastUsedSceneDataForFullUpdate = &Scene->DistanceFieldSceneData;
2015-05-11 20:04:15 -04:00
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
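One plausible source of the waste described above is power-of-two rounding. The sketch below assumes a classic buddy scheme in which every request is rounded up to the next power-of-two block. BuddyBlockSize, BuddyWaste and the minimum block size are hypothetical and are not taken from the D3D12RHI allocator.

```cpp
#include <cassert>
#include <cstdint>

// Size of the block a classic buddy allocator would hand out for a request:
// the smallest power-of-two multiple of MinBlock that is >= Size.
// Assumes Size is small enough that the doubling does not overflow.
inline uint64_t BuddyBlockSize(uint64_t Size, uint64_t MinBlock)
{
    // MinBlock must be a non-zero power of two.
    assert(MinBlock != 0 && (MinBlock & (MinBlock - 1)) == 0);
    uint64_t Block = MinBlock;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding for a single allocation.
inline uint64_t BuddyWaste(uint64_t Size, uint64_t MinBlock)
{
    return BuddyBlockSize(Size, MinBlock) - Size;
}
```

With a hypothetical 256-byte minimum block, a 5000-byte request occupies an 8192-byte block and loses 3192 bytes, while a 4096-byte request loses nothing. The absolute loss grows with request size, which is one reason to route larger requests to a different allocator.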
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize setting to the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
const FGlobalDFCacheType StartCacheType = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? GDF_MostlyStatic : GDF_Full;
2020-07-06 18:58:26 -04:00
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
for (uint32 CacheType = StartCacheType; CacheType < GDF_Num; CacheType++)
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite for further PSO work where we want to move up state setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky about double underscores and wants the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
{
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
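A minimal sketch of the render target count fix described in change 3251058, assuming the corrected count is the index of the last valid format plus one, so that an array of all-unknown formats yields zero. `EFormat` and `ComputeNumValidRenderTargets` are hypothetical stand-ins, not the names used in D3D12RHI.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for DXGI_FORMAT; only the 'Unknown' sentinel matters here.
enum class EFormat : uint32_t
{
    Unknown = 0,
    ColorA,
    ColorB
};

// Returns the number of render target slots that should be reported at PSO
// creation: one past the last slot holding a valid format. Assumption: gaps
// before the last valid slot are preserved, trailing unknown slots are trimmed.
inline uint32_t ComputeNumValidRenderTargets(const EFormat* Formats, uint32_t NumRenderTargets)
{
    assert(Formats != nullptr || NumRenderTargets == 0);

    uint32_t NumValid = 0;
    for (uint32_t Index = 0; Index < NumRenderTargets; ++Index)
    {
        if (Formats[Index] != EFormat::Unknown)
        {
            NumValid = Index + 1;
        }
    }
    return NumValid;
}
```

With three slots that are all `EFormat::Unknown`, the helper returns 0 rather than 3, which is the case the change description calls out.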
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
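To illustrate why a lower size threshold for the buddy suballocator in change 3251141 can cut waste, here is a rough sketch. It assumes the waste comes from rounding each request up to a power-of-two block, which is how buddy allocators generally behave; the log does not state the exact cause. All names (`BuddyBlockSize`, `BuddyWaste`, `ShouldUseBuddyAllocator`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Size of the block a buddy allocator would hand out for a request: the
// request rounded up to the next power-of-two multiple of MinBlockSize.
// MinBlockSize is assumed to be a power of two.
inline uint64_t BuddyBlockSize(uint64_t RequestedBytes, uint64_t MinBlockSize)
{
    assert(RequestedBytes > 0 && MinBlockSize > 0);

    uint64_t Block = MinBlockSize;
    while (Block < RequestedBytes)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding for a single request.
inline uint64_t BuddyWaste(uint64_t RequestedBytes, uint64_t MinBlockSize)
{
    return BuddyBlockSize(RequestedBytes, MinBlockSize) - RequestedBytes;
}

// Requests above the threshold bypass the buddy allocator, so they never
// pay the rounding cost modelled above.
inline bool ShouldUseBuddyAllocator(uint64_t RequestedBytes, uint64_t MaxBuddyAllocationSize)
{
    return RequestedBytes <= MaxBuddyAllocationSize;
}
```

Under this model a 5000-byte request with a 256-byte minimum block occupies 8192 bytes. With the threshold at 512KB it goes through the buddy allocator and pays that cost; with the threshold at 4KB it is routed elsewhere.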
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
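The inheritance described in change 3260594 can be sketched as follows, assuming that compositing takes the per-voxel minimum distance, as is usual for distance fields. The log does not show the actual shader code. `FDistanceVolume` and `CompositeFullVolume` are hypothetical names, and volumes are flattened to 1D arrays for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One distance value per voxel; a real clipmap would be a 128^3 volume texture.
using FDistanceVolume = std::vector<float>;

// Rebuilds the full volume by starting from the cached mostly-static
// distances and folding in only the movable objects. The static primitives
// are never re-composited here, which is where the saving would come from.
inline FDistanceVolume CompositeFullVolume(
    const FDistanceVolume& MostlyStaticCache,
    const std::vector<FDistanceVolume>& MovableObjects)
{
    FDistanceVolume Full = MostlyStaticCache;
    for (const FDistanceVolume& Object : MovableObjects)
    {
        assert(Object.size() == Full.size());
        for (std::size_t Voxel = 0; Voxel < Full.size(); ++Voxel)
        {
            Full[Voxel] = std::min(Full[Voxel], Object[Voxel]);
        }
    }
    return Full;
}
```

When a movable object changes, only the voxels in its update region would need this pass; the cache itself stays untouched until a static or stationary primitive changes.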
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
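The fence reuse bug in CL 3231395 (waiting for 0, which is always signaled) can be illustrated with a small CPU-side model. This is a sketch under the assumption that the fix amounts to using strictly increasing fence values that are never reset. `FMonotonicFence` is a hypothetical class, not the D3D12RHI fence type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of a GPU fence with monotonically increasing values.
class FMonotonicFence
{
public:
    // Queues a signal and returns the value to wait on. Values start at 1
    // and only ever increase, so they are never reset to an earlier value.
    uint64_t Signal()
    {
        return ++LastSignaled;
    }

    // Models the GPU reaching a previously queued signal.
    void MarkCompleted(uint64_t Value)
    {
        assert(Value <= LastSignaled);
        if (Value > CompletedValue)
        {
            CompletedValue = Value;
        }
    }

    // A wait on Value would return immediately when this is true.
    bool IsComplete(uint64_t Value) const
    {
        return CompletedValue >= Value;
    }

private:
    uint64_t LastSignaled = 0;
    uint64_t CompletedValue = 0;
};
```

In this model `IsComplete(0)` is true before any work has been submitted, which mirrors the bug: a wait on 0 never blocks. Waiting on the value returned by `Signal()` only succeeds after `MarkCompleted` has been called for it.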
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
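A rough sketch of the allocation scheme described in change 3271594: physical pages are taken from a free list in whatever order they are available, and the position in the returned list plays the role of the contiguous virtual page index. `FPageAllocator` and its methods are hypothetical; the real allocator and its virtual memory mapping calls are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical free-list page allocator. OutPages[i] is the physical page
// backing virtual page i of the resource, so the virtual range is contiguous
// even when the physical pages are not.
class FPageAllocator
{
public:
    explicit FPageAllocator(uint32_t NumPages)
    {
        FreePages.reserve(NumPages);
        // Push in reverse so that the lowest page numbers are handed out first.
        for (uint32_t Page = NumPages; Page > 0; --Page)
        {
            FreePages.push_back(Page - 1);
        }
    }

    // Takes Count pages from the free list. Returns false and leaves both the
    // allocator and OutPages unchanged when not enough pages are free.
    bool Allocate(uint32_t Count, std::vector<uint32_t>& OutPages)
    {
        if (Count > FreePages.size())
        {
            return false;
        }
        OutPages.clear();
        for (uint32_t Index = 0; Index < Count; ++Index)
        {
            OutPages.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return true;
    }

    // Returns pages to the free list when a resource is destroyed.
    void Free(const std::vector<uint32_t>& Pages)
    {
        FreePages.insert(FreePages.end(), Pages.begin(), Pages.end());
    }

    std::size_t NumFree() const
    {
        return FreePages.size();
    }

private:
    std::vector<uint32_t> FreePages;
};
```

Freeing a resource in the middle of the page range and then allocating a larger one shows the point of the design: the new resource can reuse the freed page together with another page that is not adjacent to it.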
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
FGlobalDistanceFieldClipmap& Clipmap = *(CacheType == GDF_MostlyStatic
	? &GlobalDistanceFieldInfo.MostlyStaticClipmaps[ClipmapIndex]
	: &GlobalDistanceFieldInfo.Clipmaps[ClipmapIndex]);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite for further PSO work where we want to move up state setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky about double underscores and wants the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
2021-06-16 17:48:21 -04:00
const TArray < FRenderBounds > & PrimitiveModifiedBounds = ClipmapViewState . Cache [ CacheType ] . PrimitiveModifiedBounds ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
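One way to read "the correct number of valid rendertargets" is one past the index of the last slot whose format is not unknown, so that a list made up entirely of unknown formats yields zero. The sketch below works under that assumption only. `EFormat` and `CountValidRenderTargets` are hypothetical names standing in for DXGI_FORMAT and the RHI's own logic, which this changelist text does not show.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for DXGI_FORMAT; only the "unknown" value matters here.
enum class EFormat { Unknown = 0, RGBA8, RGBA16F };

// Returns one past the index of the last slot with a valid format.
// If every slot is Unknown, the result is 0.
inline std::size_t CountValidRenderTargets(const EFormat* Formats, std::size_t NumSlots)
{
    assert(Formats != nullptr || NumSlots == 0);
    std::size_t Count = 0;
    for (std::size_t Index = 0; Index < NumSlots; ++Index)
    {
        if (Formats[Index] != EFormat::Unknown)
        {
            Count = Index + 1;
        }
    }
    return Count;
}
```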
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
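To illustrate why a buddy allocator can waste memory on larger requests, the sketch below assumes a textbook buddy scheme in which every request is rounded up to a power-of-two block. It is not the D3D12 RHI allocator, and `BuddyBlockSize` and `BuddyWaste` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Textbook buddy behaviour (an assumption, not the engine's code): a request is
// served from the smallest power-of-two block that is >= the requested size.
inline std::uint64_t BuddyBlockSize(std::uint64_t Requested, std::uint64_t MinBlock)
{
    assert(Requested > 0);
    assert(MinBlock > 0 && (MinBlock & (MinBlock - 1)) == 0);
    std::uint64_t Block = MinBlock;
    while (Block < Requested)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding for a single allocation.
inline std::uint64_t BuddyWaste(std::uint64_t Requested, std::uint64_t MinBlock)
{
    return BuddyBlockSize(Requested, MinBlock) - Requested;
}
```

Under this model a 5000-byte request lands in an 8192-byte block and loses 3192 bytes, while a request that is already a power of two loses nothing.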
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2021-06-16 17:48:21 -04:00
TArray < FRenderBounds , SceneRenderingAllocator > CulledPrimitiveModifiedBounds ;
2020-07-06 18:58:26 -04:00
CulledPrimitiveModifiedBounds . Empty ( ClipmapViewState . Cache [ CacheType ] . PrimitiveModifiedBounds . Num ( ) / 2 ) ;
Clipmap . UpdateBounds . Empty ( ClipmapViewState . Cache [ CacheType ] . PrimitiveModifiedBounds . Num ( ) / 2 ) ;
for ( int32 BoundsIndex = 0 ; BoundsIndex < ClipmapViewState . Cache [ CacheType ] . PrimitiveModifiedBounds . Num ( ) ; BoundsIndex ++ )
{
2021-06-16 17:48:21 -04:00
const FRenderBounds PrimBounds = ClipmapViewState . Cache [ CacheType ] . PrimitiveModifiedBounds [ BoundsIndex ] ;
2022-02-02 07:59:31 -05:00
const FVector PrimWorldCenter = ( FVector ) PrimBounds . GetCenter ( ) ;
const FVector PrimWorldExtent = ( FVector ) PrimBounds . GetExtent ( ) ;
2020-07-06 18:58:26 -04:00
const FBox ModifiedBounds ( PrimWorldCenter - PrimWorldExtent , PrimWorldCenter + PrimWorldExtent ) ;
2020-09-08 17:44:06 -04:00
if ( ModifiedBounds . ComputeSquaredDistanceToBox ( ClipmapBounds ) < ClipmapInfluenceRadius * ClipmapInfluenceRadius )
2020-07-06 18:58:26 -04:00
{
CulledPrimitiveModifiedBounds . Add ( ModifiedBounds ) ;
Clipmap . UpdateBounds . Add ( FClipmapUpdateBounds ( ModifiedBounds . GetCenter ( ) , ModifiedBounds . GetExtent ( ) , true ) ) ;
if ( GAODrawGlobalDistanceFieldModifiedPrimitives )
{
const uint8 MarkerHue = ( ( ClipmapIndex * 10 + BoundsIndex ) * 10 ) & 0xFF ;
const uint8 MarkerSaturation = 0xFF ;
const uint8 MarkerValue = 0xFF ;
FLinearColor MarkerColor = FLinearColor :: MakeFromHSV8 ( MarkerHue , MarkerSaturation , MarkerValue ) ;
MarkerColor . A = 0.5f ;
DrawWireBox ( & ViewPDI , ModifiedBounds , MarkerColor , SDPG_World ) ;
}
}
}
if ( bUsePartialUpdates )
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3219450)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3148067 on 2016/10/01 by Daniel.Wright
Support for ReflectionEnvironment and light type show flags with ForwardShading
Change 3149085 on 2016/10/03 by Daniel.Wright
Support for ReflectionEnvironment show flag in base pass reflections without any shader overhead
Change 3162206 on 2016/10/13 by Chris.Bunner
Merging Dev-MaterialLayers to Dev-Rendering, CL 3161593:
Material expressions; Trig, fast-trig, saturate, round, truncate, pre-skinned normal.
Added CustomEyeTangent to material attributes.
Resolved some hard-coded attribute typing and other minor fixes.
Change 3186067 on 2016/11/03 by Daniel.Wright
Updated Stationary primitive tooltip to indicate that it allows the primitive to be changed, but not moved
Change 3186069 on 2016/11/03 by Daniel.Wright
Using a weighted geometric mean to combine multiple Distance Field Indirect Shadows, greatly reduces over-occlusion when overlap is high
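For reference, a weighted geometric mean of visibility values can be computed in log space as exp(sum(w_i * ln(v_i)) / sum(w_i)). The sketch below shows only that formula. How the renderer derives its weights is not stated here, so the weights, the clamp value and the name `CombineShadowVisibility` are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted geometric mean of per-caster visibility values in [0, 1].
// Hypothetical helper: the weights and the clamp threshold are illustrative.
inline double CombineShadowVisibility(const std::vector<double>& Visibility,
                                      const std::vector<double>& Weights)
{
    assert(Visibility.size() == Weights.size());
    const double MinVisibility = 1e-6; // avoids log(0) for fully occluded samples
    double WeightedLogSum = 0.0;
    double TotalWeight = 0.0;
    for (std::size_t Index = 0; Index < Visibility.size(); ++Index)
    {
        const double Clamped = std::min(1.0, std::max(MinVisibility, Visibility[Index]));
        WeightedLogSum += Weights[Index] * std::log(Clamped);
        TotalWeight += Weights[Index];
    }
    if (TotalWeight <= 0.0)
    {
        return 1.0; // no casters: fully visible
    }
    return std::exp(WeightedLogSum / TotalWeight);
}
```

Two overlapping casters that each leave 0.5 visibility combine to 0.5 under this mean, whereas simply multiplying them would give 0.25, which is consistent with the reduced over-occlusion described above.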
Change 3186084 on 2016/11/03 by Mark.Satterthwaite
Duplicate 3172511:
Don't set Metal resource option fields on texture descriptors when running on an OS that doesn't support them.
#jira UE-37481
Change 3186089 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3169764:
Fixed automatic conversion of G8_sRGB into RGBA8_sRGB required for Mac Metal, which fixes FORT-27627.
#jira FORT-27627
Change 3186113 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183807:
Change the way we access the Metal viewport's backbuffer, to reduce possible causes of FORT-31649:
- Added console variable "rhi.Metal.SupportsIntermediateBackBuffer" to control whether to use an extra render-target so we can support screenshots & movie capture, or render directly to the back-buffer to save memory & GPU performance. Still defaults to ON for Mac & OFF for iOS/tvOS.
- Change the way we handle updates to the back-buffer size to ensure that the different threads access their intended version.
#jira FORT-31649
Change 3186116 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183823:
Record Metal resource & state objects used in a command-buffer when rhi.Metal.RuntimeDebugLevel is set to 3 or higher. The object labels, types & descriptions will be printed on failure - if the object is deleted prior to this then we have a lifetime error and it will crash at this point and can be debugged further using our -metalretainrefs command-line option or Xcode's zombie-objects.
Used to verify that FORT-31649 is not a simple resource lifetime error and thereby speed up Apple/vendor investigations.
#jira FORT-31649
Change 3186818 on 2016/11/04 by Chris.Bunner
PR #2907 Export UMaterialExpressionNoise (contributed by kayosiii).
Change 3186979 on 2016/11/04 by Rolando.Caloca
DR - Misc minor cleanup
Change 3187169 on 2016/11/04 by Uriel.Doyon
Incremental insertion of level data between PostLoad and AddToWorld
Change 3187205 on 2016/11/04 by Mark.Satterthwaite
Compile fixes for iOS.
Change 3187389 on 2016/11/04 by Uriel.Doyon
Fix for possible stall when loading hidden level
Change 3187598 on 2016/11/04 by Michael.Trepka
MetalViewport compile fix
Change 3187678 on 2016/11/04 by Uriel.Doyon
Fix for landscape grass textures not being streamed in correctly.
Change 3187731 on 2016/11/04 by Rolando.Caloca
DR - Start making type safe some cross compiler enums
Change 3187824 on 2016/11/04 by Rolando.Caloca
DR - clang compile fix
Change 3187953 on 2016/11/04 by Rolando.Caloca
DR - vk - Mac compile fix
Change 3188696 on 2016/11/07 by Mark.Satterthwaite
Another iOS compile fix for new MetalViewport validation code.
Change 3188906 on 2016/11/07 by Rolando.Caloca
DR - Show permutation of LUTBlender
Change 3189094 on 2016/11/07 by Chris.Bunner
Fix RemoveAAJitter from projection matrix.
#jira UE-37701, UE-38003
Change 3189134 on 2016/11/07 by Daniel.Wright
Fix for CreateRenderTarget2D called in construction script during cooking
Change 3189145 on 2016/11/07 by Chris.Bunner
Follow-up to CL 3186818, export UMaterialExpressionVectorNoise.
Change 3189239 on 2016/11/07 by Daniel.Wright
Added show flag for Contact Shadows, disabled in planar reflections
Change 3189252 on 2016/11/07 by Daniel.Wright
Support for Reflection Capture intensity with simple reflections, which are the default with Forward Shading
Change 3189406 on 2016/11/07 by Mark.Satterthwaite
Really fix the last of the iOS compile errors from changes to the MetalViewport code.
Change 3190854 on 2016/11/08 by Ben.Woodhouse
XB1: Fix memory corruption with RHICreateVertexBuffer and RHICreateIndexBuffer when using initial data (Procedural Mesh Component crash)
#jira UE-34264
#fyi james.golding
#fyi keith.judge
Change 3190962 on 2016/11/08 by Olaf.Piesche
Unshelved from pending changelist '3176615' - Gil's fix for race condition with particle vertex factory reuse across different passes; potential to fix a number of issues
Change 3191959 on 2016/11/09 by Uriel.Doyon
Removed some static primitives from the dynamic primitive handler for texture streaming.
Change 3193122 on 2016/11/10 by Chris.Bunner
Always update non-preview material resources for use in code preview.
#jira UE-38223
Change 3193190 on 2016/11/10 by Gil.Gribb
UE4 - Fixed rare bug with shadow groups rendering things that have not been setup to render this frame.
#jira UE-36379
Change 3193523 on 2016/11/10 by Uriel.Doyon
Fixed incorrect section bounds used for texture streaming.
Change 3193962 on 2016/11/10 by Uriel.Doyon
Added defrag of dynamic bounds used for the texture streaming. Allows unused bounds to be removed over time.
Change 3193974 on 2016/11/10 by Uriel.Doyon
New "Required Texture Resolution" view mode. Showing the ratio between the currently streamed texture resolution and the resolution wanted by the GPU.
Change 3194109 on 2016/11/10 by Uriel.Doyon
Another patch on material bounds used for texture streaming.
Change 3194665 on 2016/11/11 by Chris.Bunner
Duplicated behavior for inherited velocity scaling to vert/surface spawned particles.
Change 3194734 on 2016/11/11 by Rolando.Caloca
DR - vk - Simplified some texture casting
Change 3194867 on 2016/11/11 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3195176 on 2016/11/11 by Chris.Bunner
Fixed incorrectly updated NVAPI error.
Change 3195425 on 2016/11/11 by Uriel.Doyon
Fixed possible invalid level reference in the texture streamer
Change 3196512 on 2016/11/14 by Gil.Gribb
Merging //UE4/Dev-Main@3196156 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3196750 on 2016/11/14 by Marcus.Wassmer
Fix ordering problem with GPU cache transitions
Change 3196815 on 2016/11/14 by Daniel.Wright
Suppressed 'Instanced stereo rendering is not supported' warning showing up in CIS
Change 3196818 on 2016/11/14 by Daniel.Wright
Fixed FIndirectLightingCache::UpdateCachePrimitivesInternal churning through a bunch of temporary memory
Change 3196819 on 2016/11/14 by Daniel.Wright
Volume lighting samples are allowed outside of the importance volume if their influence affects the volume. Fixes black indirect lighting on movable components in maps with small importance volumes.
Volume lighting samples placed on surfaces use a radius that covers the layer height spacing, which prevents an uncovered region between layers
Change 3197243 on 2016/11/14 by Uriel.Doyon
Async Task For Updating static component LastRender time
#jira UE-24268
Change 3197359 on 2016/11/14 by Daniel.Wright
Added Inscattering Texture controls to ExponentialHeightFog
* When InscatteringColorCubemap is specified, directional light inscattering is disabled
* Lerps between 1x1 mip at NonDirectionalInscatteringColorDistance to mip 0 at FullyDirectionalInscatteringColorDistance
* Added FogCutoffDistance, so artists can prevent fog on skyboxes (requires fog to be setup matching the fog that was rendered into the sky texture so that distant mountains match)
* Fog shader permutations based on what feature is enabled
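The mip interpolation described in the list above can be sketched as a clamped linear blend over distance. This is an illustration, not the fog shader: the function name `InscatteringCubemapMip` is hypothetical, `LastMip` stands for the index of the 1x1 mip, and the sketch assumes FullyDirectionalInscatteringColorDistance is greater than NonDirectionalInscatteringColorDistance.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Returns the cubemap mip to sample for a given distance: the 1x1 mip (LastMip)
// at NonDirectionalDistance, blending linearly to mip 0 at FullyDirectionalDistance.
// Hypothetical helper; assumes FullyDirectionalDistance > NonDirectionalDistance.
inline float InscatteringCubemapMip(float Distance,
                                    float NonDirectionalDistance,
                                    float FullyDirectionalDistance,
                                    float LastMip)
{
    assert(FullyDirectionalDistance > NonDirectionalDistance);
    const float Range = FullyDirectionalDistance - NonDirectionalDistance;
    const float T = std::min(1.0f, std::max(0.0f, (Distance - NonDirectionalDistance) / Range));
    return LastMip + (0.0f - LastMip) * T; // lerp(LastMip, 0, T)
}
```

Distances closer than the non-directional distance clamp to the 1x1 mip, and distances beyond the fully directional distance clamp to mip 0.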
Change 3198419 on 2016/11/15 by Chris.Bunner
PS4 HDR: Runtime toggle (backbuffer recreation on resize matching), UI composition. Matches PC behavior and controls.
HDR: Generalized buffer formats, cvar consistency pass, LUT for UI composition, refactoring common functions.
Exposed RHICreateTargetableShaderResource3D.
Moved some (translucent) volume rendering helpers to allow access in Slate.
Change 3198822 on 2016/11/15 by Daniel.Wright
Mac compile fix
Change 3199509 on 2016/11/15 by Uriel.Doyon
Added support for viewmode param asset name (and not just param value).
Used to investigate texture streamer behavior.
Change 3199578 on 2016/11/15 by Rolando.Caloca
DR - Add some shader resource tables to SCW when running with -directcompile
Change 3199698 on 2016/11/15 by Rolando.Caloca
DR - vk - Refactor shader & descriptor bindings
Change 3199712 on 2016/11/15 by Rolando.Caloca
DR - vk - r.Vulkan.StripGlsl to always strip glsl at runtime to save memory per shader
Change 3199717 on 2016/11/15 by Rolando.Caloca
DR - vk - Show hitching PSO info again
Change 3199750 on 2016/11/15 by Rolando.Caloca
DR - SCW clang compile fixes
Change 3200353 on 2016/11/16 by Rolando.Caloca
DR - vk - Mac fix
Change 3200358 on 2016/11/16 by Chris.Bunner
Only allow UI composition on platforms where we currently use it.
Change 3200823 on 2016/11/16 by Chris.Bunner
Remove expression key attribute ID when not translating an attribute output to allow intended expression sharing.
#jira UE-38699
Change 3200947 on 2016/11/16 by Mark.Satterthwaite
Fix UE-38695 by not trying to resize the viewport on the wrong thread.
#jira UE-38695
Change 3201069 on 2016/11/16 by Daniel.Wright
Fog inscattering texture limited to SM4 and above, fixes ES2 compile errors
Change 3201346 on 2016/11/16 by Brian.Karis
Temporal AA fix for correct edge gradients.
Filtering now combined with importance sampling.
Enabled Catmull-Rom resolve filter. Results are now slightly sharper.
Fixed antighosting. Still requires a dilation to be perfect.
Optimized bicubic filtering to 5 taps instead of 9.
Cleaned out unused code.
Change 3201369 on 2016/11/16 by Brian.Karis
Bicubic texture sample
Change 3201522 on 2016/11/16 by Rolando.Caloca
DR - vk - Fix static analysis issues
Change 3201878 on 2016/11/17 by Chris.Bunner
Temporarily disable Nvapi HDR error logging.
#jira UE-38529
Change 3202108 on 2016/11/17 by Simon.Tovey
Assets with easy repro for flickering particles bug
Change 3202181 on 2016/11/17 by Rolando.Caloca
DR - vk - CIS android fix
Change 3202325 on 2016/11/17 by Ben.Woodhouse
Integrate 4.14.1 fix from 14 //UE4/Release-4.14 (@3201850)
Fix CreateVertexbuffer and CreateIndexBuffer memory corruption (Procedural Mesh Component crash)
#jira UE-34264
Change 3204394 on 2016/11/18 by Guillaume.Abadie
PR #2808: AlphaComposite Fog Opacity fix (Contributed by moritz-wundke)
#br Ben.Woodhouse
Change 3204428 on 2016/11/18 by Guillaume.Abadie
Fixes a couple of issues in decals:
* Crash in FDecalDrawingPolicyFactory::DrawMesh()
* ActorPosition material expression
* PixelNormalWS material expression
* Missing renaming from DEFERRED_DECAL to DECAL_PRIMITIVE
#jira UE-38327, UE-38158, UE-37818, UE-37350
Change 3204429 on 2016/11/18 by Uriel.Doyon
Darker default undefined accuracy.
Reenabled the texture streaming build in the build all.
Change 3204458 on 2016/11/18 by Chris.Bunner
Shader truncation warnings fix.
Change 3204459 on 2016/11/18 by Chris.Bunner
Engine 'Passthrough' material function fix. V4 is now actually a V4.
Change 3204460 on 2016/11/18 by Chris.Bunner
Correctly handle some known Nvapi warnings.
#jira UE-38529
Change 3204653 on 2016/11/18 by Marc.Olano
Helper functions for tiled textures
Checking in for Ryan Brucks
Change 3204863 on 2016/11/18 by Arne.Schober
DR - Replaced ENQUEUE_UNIQUE_RENDER_COMMAND with a Debuggable template Implementation
Change 3204939 on 2016/11/18 by Arne.Schober
DR - Make clang happy
Change 3204968 on 2016/11/18 by Arne.Schober
DR - UE-38494 - Fixed SpeedTree Wind crash, when force deleting the Asset.
Change 3206293 on 2016/11/21 by Uriel.Doyon
New member bHasStreamingUpdatePending in UTexture2D to delay update of global distance fields.
Set to true when the streamer can possibly load a mip in the near future.
#jira UE-37787
Change 3206551 on 2016/11/21 by Chris.Bunner
Added material update context when forcing all shaders to recompile.
#jira UE-38481
Change 3206644 on 2016/11/21 by Benjamin.Hyder
Updating Planar Reflection example in TM-Shadermodels.
Change 3206899 on 2016/11/21 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3206900 on 2016/11/21 by Rolando.Caloca
DR - Added missing strings for shader formats
Change 3206983 on 2016/11/21 by Rolando.Caloca
DR - vk - Support for SV_Coverage
Change 3207237 on 2016/11/22 by Simon.Tovey
Exporting particle module base and a couple of child classes as it's commonly requested.
#test compiles
Change 3207241 on 2016/11/22 by Gil.Gribb
Merging //UE4/Dev-Main@3206998 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3207520 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //Fortnite/Main@3206301
Fixed GPU hang in Zone Map view. Was an issue with RenderThread using the device context without appropriate RHIThread flushes.
#jira FORT-31616
#code_review keith.judge
Change 3207541 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //fortnite/Main@3207422
* Fix UpdateTexture3D to create a staging texture of the region to update rather than the whole texture. This prevents distance fields crashing during update (allocating 18GB per frame in some cases)
* Put UpdateTexture2D DMA support onto a cvar, disabled by default (corruption issues reported by licensees, plus not sure it's actually faster - could be slower due to reduced bandwidth)
* Fix UpdateTexture2D to only create a staging texture of the region to update, saving memory
#jira UE-38609
Change 3207654 on 2016/11/22 by Chris.Bunner
Don't flag 16-bit PNG/JPG textures as sRGB on import.
#jira UE-30279
Change 3208434 on 2016/11/22 by Rolando.Caloca
DR - vk - UAV transitions
Change 3208490 on 2016/11/22 by Chris.Bunner
Break material code sharing when we detect an unresolvable loop.
By default change IsResultMA loop detection to stop on functions as we can determine type definitively.
Unified IsResultMA detection across switch nodes.
Change 3208860 on 2016/11/23 by Rolando.Caloca
DR - vk - Fix some format issues
Change 3209265 on 2016/11/23 by Arne.Schober
DR - originally unshelved from 3153924 - Made Depth and Velocity Rendering Passes use the PSO-only RHI interface.
We are now passing down two structs that collect all the necessary information for the drawing policies to construct a PSO object.
One during construction of the Policy, which contains information about the CullMode, FillMode and PrimType.
And another during rendering that passes information like BlendState and DepthStencilState down to the low level renderer into SetSharedState.
Performance of the static drawlist is slightly slower (less than 0.1ms on Consoles) due to some additional branches and copies. The branches in the FDrawingPolicyRenderState will go away as soon as everything is converted to use the PSO interface.
Performance of the GPU is slightly better due to fewer context rolls (mainly CullMode sorts in differently now)
Change 3209305 on 2016/11/23 by Guillaume.Abadie
Fix contact shadow's assumption about object thickness
Change 3209334 on 2016/11/23 by Brian.Karis
Fixed TAA handling of alpha. Switched the meaning of AA_ALPHA to make sense.
Change 3209903 on 2016/11/24 by Guillaume.Abadie
Cherry picks alpha through post processing changelists 3201959, 3204143 and 3209883 from //UE4/Private-Partner-NREAL
Change 3209973 on 2016/11/24 by Ben.Woodhouse
Fix D3D11 and 12 static analysis warnings reported by Rob Troughton of Coconut Lizard (http://coconutlizard.co.uk/blog/ue4/pvs-studio-part5/)
Change 3210023 on 2016/11/24 by Uriel.Doyon
Fixed an issue with DropDetail when FixedFrameRate was set to a value smaller than MinDesiredFrameRate.
#jira UE-37210
Change 3210026 on 2016/11/24 by Ben.Woodhouse
Disable renderthread hang detection if a debugger is present, so we can debug the renderthread without crashing
Change 3210049 on 2016/11/24 by Ben.Woodhouse
Fix mac build
Change 3210071 on 2016/11/24 by Uriel.Doyon
Fixed an issue with masked materials and shader complexity viewmode when DBuffer Decals are enabled.
#jira UE-37542
Change 3210374 on 2016/11/25 by Ben.Woodhouse
* Fix issues with fast cleared dbuffer targets not being resolved when no decals are in the scene. This caused graphical corruption on XB1 and ensure failures on PS4 (with RHIThread disabled)
* Move Decal rendertarget manager function implementations out of the header.
#jira UE-38879
Change 3210390 on 2016/11/25 by Uriel.Doyon
Fixed cubemap resourcesize not taking into account mipgen settings
#jira UE-37045
Change 3210407 on 2016/11/25 by Uriel.Doyon
"resavepackages" commandlet now supports -buildtexturestreaming that rebuilds the map texture streaming data.
That can be used in combination with -buildlighting.
Change 3210563 on 2016/11/27 by Rolando.Caloca
DR - vk - Integrate cached memory fixes and PF_D24 format fix
#jira UE-39025
PR #2974
Change 3210564 on 2016/11/27 by Rolando.Caloca
DR - Fix for GL linker
PR #2975
#jira UE-39029
Change 3210592 on 2016/11/27 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3210597 on 2016/11/27 by Rolando.Caloca
DR - vk - Prep for staging UB copies to GPU memory
Change 3210600 on 2016/11/27 by Rolando.Caloca
DR - vk - Extract generic range code
Change 3210613 on 2016/11/27 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOnDispatch
Change 3211054 on 2016/11/28 by Rolando.Caloca
DR - vk - Missing reference
Change 3211330 on 2016/11/28 by Chris.Bunner
Shader compile error for max texture coordinate count on skinned meshes.
Change 3211384 on 2016/11/28 by Arne.Schober
DR - Enforce move on EnqueueRenderCommand Lambda
Change 3211431 on 2016/11/28 by Gil.Gribb
Merging //UE4/Dev-Main@3211016 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3211738 on 2016/11/28 by Gil.Gribb
IWYU fixes after merge
Change 3212231 on 2016/11/28 by Richard.Wallis
Fix build errors
Change 3212253 on 2016/11/28 by Richard.Wallis
Remove MacGraphicsSwitching plugin.
#jira UE-37640
Change 3212310 on 2016/11/28 by Rolando.Caloca
DR - vk - Update glslang to 1.0.33.0
Change 3212446 on 2016/11/28 by Guillaume.Abadie
Implements PreviousFrameSwitch material expression
Change 3212594 on 2016/11/28 by Arne.Schober
DR - Fix missing include
Change 3212681 on 2016/11/29 by Rolando.Caloca
DR - vk - Auto flush for compute shader
Change 3213000 on 2016/11/29 by Gil.Gribb
temp fix for PF_MAX
Change 3213161 on 2016/11/29 by Ben.Woodhouse
Integrate latest D3D12 changes from //depot/Partners/Microsoft/UE4-DX12/...@3211714
Using:
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Runtime/D3D12RHI/...@3211714 //UE4/Dev-Rendering/Engine/Source/Runtime/D3D12RHI/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/ThirdParty/Windows/DirectX/...@3211714 //UE4/Dev-Rendering/Engine/Source/ThirdParty/Windows/DirectX/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Programs/UnrealBuildTool/...@3211714 //UE4/Dev-Rendering/Engine/Source/Programs/UnrealBuildTool/...
Changes from UE4-DX12:
*** CL 3183818 ***
Update D3D12 RHI to 4.14:
- Merged changes from Epic up until 10/20/16
- Fixed an issue where command allocators were resetting too early. I changed to aggressive command list batching by default: now that more SubmitCommandListHint calls exist in the upper engine, we don't need to worry about starving the GPU. Fewer ExecuteCommandLists calls means better performance and fewer Signals() so this change provides a GPU perf win.
I had to fix an issue with aggressive batching where we would sometimes hold on to a command list long enough (in the pending list) but hadn't executed it yet. The command allocator was being put back in the queue of allocators during ReleaseCommandAllocator() without a syncpoint set and was thus being reset too early. I added a simple counter to the command allocator so it could track how many command lists were using it. It doesn't need to be thread safe since only one thread uses a command allocator at a time.
I also added some stats around the # command lists and # command allocators since it would be possible to leak command allocators now if its pending command list count isn't decremented correctly. In that case we'd keep creating new command allocators and eventually run out of memory.
-Remove clear during allocate in the FD3D12FastConstantAllocator and FD3D12FastAllocator. The supplied resource locations are assumed to be new and thus don't need to be cleared.
-Cleanup D3D12RHI stats. There were some unused stats as well as some missing ones.
-Mark shader resource table uniform buffers as dirty only when the shader changes. Cleanup SetComputeShader calls and Dispatch calls to not set/unset the CS for each Dispatch.
-Remove unused Check SRV resolved code that Epic added to the D3D11 RHI and was brought over. We don't need it and we won't use this.
-Remove "always on" cycle counters for high frequency RHI methods like RHISetShaderTexture. These should use the engine's stat macros as they are removed on TEST + SHIPPING builds. On Xbox a significant amount of CPU time is spent in things like QueryPerformanceCounter even when STATS aren't enabled. Currently 1% of an entire capture on XBOX is spent inside this call.
I improved and cleaned up high frequency call stacks like:
- RHISetShaderTexture
- RHISetShaderResourceViewParameter
- RHISetShaderParameter
- RHISetUAVParameter
In general I moved to use templated functions, removed unused parameters, unnecessary copies, etc.
-Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Resources should be associated with the rendering thread's frame that it's currently recording command lists for and they shouldn't be cleaned up until those command lists have been translated to D3D12 command lists on the RHI thread AND completed executing on the GPU. This was confirmed to resolve an issue where CBV resources were being released too early.
This work involved a couple changes:
1) Move the "frame" fence to be incremented on the rendering thread (during RHIAdvanceFrameForGetViewportBackBuffer()) so that resources that are deleted from the rendering thread are associated with the correct frame count
2) Queue up a command from the rendering thread to signal the "frame" fence. It needs to be queued to ensure that it's signaled at the correct time on the RHI thread (after that frame's command lists have been executed).
-Disable GRHIRequiresEarlyBackBufferRenderTarget. Metal/Vulkan/Xbox11.x already do this. This is used by the Slate renderer during BeginRenderFrame and avoids a SetRenderTargets call.
-Enable GRHISupportsMSAADepthSampleAccess (used in the Editor). This was enabled for D3D11 on SM5, but not for D3D12.
-Delay load D3D12.dll and add root signature 1.1 support.
-Add explicit flush calls to improve resource barrier batching instead of implicit flushes inside FConditionalScopeResourceBarrier and FScopeResourceBarrier. Also update those classes with const members.
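The command allocator fix described in CL 3183818 (a per-allocator count of the command lists still using it) can be sketched as follows. This is a minimal illustration, not the D3D12 RHI code: the class and method names are hypothetical, and, as the changelist notes, the counter is a plain integer because only one thread uses a command allocator at a time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a command allocator; not the actual D3D12 RHI class.
class FCommandAllocatorSketch
{
public:
    // Called when a command list starts recording into this allocator.
    void IncrementPendingCommandLists() { ++PendingCommandLists; }

    // Called once such a command list has actually been executed.
    void DecrementPendingCommandLists()
    {
        assert(PendingCommandLists > 0);
        --PendingCommandLists;
    }

    // The allocator may only return to the reuse queue when no pending list uses it.
    bool CanBeReleased() const { return PendingCommandLists == 0; }

private:
    // Not atomic: assumes a single thread owns the allocator at any one time.
    uint32_t PendingCommandLists = 0;
};
```

If a decrement is ever missed, `CanBeReleased()` never becomes true again, which is the leak scenario the added stats are meant to expose.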
*** CL 3183824 ***
Fix the D3D12 RHI after integrating UE 4.14 updates:
- Fixed a bug where we would try to get the PSO of a nullptr in SetPipelineState if we needed to reset the current PSO on the cmd list.
- Fixed a spelling error
- Removed the need for bForceState, we use dirty bits now
*** CL 3183830 ***
- GetDebugFlags RHI extension, needed by XB1 movie player.
- Only query memory info if stats are enabled
- Add support for the engine's new RHISubmitCommandsAndFlushGPU function
- Update CommitPendingPipelineState to be Graphics/Compute specific and avoid the need for a IsCompute parameter.
*** CL 3183837 ***
Made PipelineState caches contain pointers to FD3D12PipelineState objects to avoid issues with using pointers to values after Find/Add on the maps. TMap indicates that the pointer to the value associated with a key "is only valid until the next change to any key in the map." The lifetime of the PSO pointers is managed by the low level caches (graphics and compute). Added stat for the number of Pipeline State Objects.
*** CL 3183931 ***
Update Windows D3D12 headers and libs to RS1 release bits (10.0.14393.0)
*** CL 3183978 ***
Update UBT Windows build settings:
- Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Delay load D3D12.dll and add root signature 1.1 support.
*** CL 3184132 ***
Fix Xbox PSO cache code where it could leak PSOs. Related to change 3183837.
*** Changelist 3211714 ***
Update D3D12 RHI with fixes:
- Check if we can reserve slots in GatherUniqueSamplerTables
- DirtyState more often in StateCache
- Remove InternalSetSamplerState. The alternate function isn't used.
- Allow MRTClear for arrays with holes in them
- Fix uninitialized descriptors. This was causing a GPU hang on Xbox. We need to set dirty bits for resources bound to slots outside of the current descriptor table's range
- Cleanup SetDescriptorHeap code. Move setting descriptor heap logic to the descriptor cache since it also owns things like the sampler maps. Added members to the descriptor cache to track the last heaps that were set on the command list to avoid dirtying bits unnecessarily.
- Resource transitions: go through Common between queues (3D <--> Compute)
- Fix initial state for placed resources.
- Merging epic
Change 3213250 on 2016/11/29 by Chris.Bunner
GBufferHints tooltip fix.
#jira UE-39103
Change 3213345 on 2016/11/29 by Gil.Gribb
more IWYU fallout
Change 3213676 on 2016/11/29 by Rolando.Caloca
DR - Fix incorrect texture getting cleared
Change 3213728 on 2016/11/29 by Rolando.Caloca
DR - Lambda-ize
Change 3214461 on 2016/11/29 by Ben.Woodhouse
Rollout August QFE4 XDK (required for latest DX12 changes on XB1)
Change 3215317 on 2016/11/30 by Daniel.Wright
PS4 compile fix
Change 3216343 on 2016/11/30 by Arne.Schober
DR - UE-39155 - after talking to Brian it occurred to us that flipping the world space normal is nonsensical. And indeed the Grass was using world space normals.
Change 3216844 on 2016/12/01 by Ben.Woodhouse
Fix for static analysis warnings after discussion with Microsoft
Change 3216916 on 2016/12/01 by Gil.Gribb
Merging //UE4/Dev-Main@3216539 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3217385 on 2016/12/01 by Arne.Schober
DR - UE-39218, UE-39221, UE-39224 and potentially UE-39214 - The Stencil bits for Light channels and decal application were not set in the dynamic basepass
Change 3217464 on 2016/12/01 by Ben.Woodhouse
Fix for reflection capture resize assert. The assert is only valid in cooked builds, so disable it in editor
#jira UE-39225
Change 3217534 on 2016/12/01 by Arne.Schober
DR - Fix Merge conflict
Change 3217581 on 2016/12/01 by Rolando.Caloca
DR - Fix assert on debug
Change 3217741 on 2016/12/01 by Benjamin.Hyder
Duplicate audio fix.
Change 3217890 on 2016/12/01 by Rolando.Caloca
DR - Fix widget not rendering properly when hidden
#jira UE-39221
Change 3218129 on 2016/12/01 by Arne.Schober
DR - UE-39214 - LOD dither value was accidentally cached across the static draw list.
Change 3218759 on 2016/12/02 by Guillaume.Abadie
Fixes editor compositing bug caused by alpha through post processing change 3209903
#jira UE-39221
[CL 3219854 by Marcus Wassmer in Main branch]
2016-12-02 16:43:04 -05:00
{
2020-09-08 17:44:06 -04:00
FIntVector MovementInPages = PageGridCenter - ClipmapViewState.LastPartialUpdateOriginInPages;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
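One way to compute a valid render target count like the one change 3251058 describes is to take the index of the last slot with a known format, plus one. The sketch below assumes that rule; the enum and function are hypothetical, with `Unknown` standing in for DXGI_FORMAT_UNKNOWN, and the engine's actual logic may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical format enum; Unknown plays the role of DXGI_FORMAT_UNKNOWN.
enum class ERenderTargetFormat : uint8_t { Unknown, RGBA8, RGBA16F };

// Returns the index of the last slot with a known format, plus one.
// Returns 0 when every slot is Unknown, even if NumRenderTargets > 0.
inline uint32_t ComputeNumValidRenderTargets(const ERenderTargetFormat* Formats, uint32_t NumRenderTargets)
{
    assert(Formats != nullptr || NumRenderTargets == 0);
    uint32_t NumValid = 0;
    for (uint32_t Index = 0; Index < NumRenderTargets; ++Index)
    {
        if (Formats[Index] != ERenderTargetFormat::Unknown)
        {
            NumValid = Index + 1;
        }
    }
    return NumValid;
}
```

Under this rule a pipeline description that claims two targets but lists only unknown formats is created with a count of zero.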
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
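One common source of waste in a buddy scheme is that each request is rounded up to a power-of-two block. The sketch below illustrates that effect and a size threshold that routes larger requests elsewhere. It is an assumption that this rounding is the dominant cause of the waste reported in change 3251141, and all names here are hypothetical rather than taken from the D3D12 RHI.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= Size.
inline uint64_t RoundUpToPowerOfTwo(uint64_t Size)
{
    assert(Size > 0 && Size <= (uint64_t(1) << 63));
    uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding when a power-of-two buddy block serves this request.
inline uint64_t BuddyWaste(uint64_t Size)
{
    return RoundUpToPowerOfTwo(Size) - Size;
}

// Hypothetical routing rule: only requests up to the threshold use the buddy allocator.
inline bool UseBuddyAllocator(uint64_t Size, uint64_t MaxBuddyAllocationSize)
{
    return Size <= MaxBuddyAllocationSize;
}
```

In this model a 5000-byte request occupies an 8192-byte block, so lowering the threshold to 4KB keeps such requests out of the buddy allocator entirely.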
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
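The fence reuse bug in CL 3231395 follows from how monotonic fences are checked: a wait on value 0 is satisfied from the start, so a counter that reaches the maximum and wraps to 0 yields waits that never block. The model below is a hedged sketch with hypothetical names, not the D3D12 fence API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a monotonically increasing GPU fence.
struct FFenceSketch
{
    uint64_t LastCompleted = 0; // highest value the GPU has reached so far
    uint64_t NextToSignal = 1;  // value the next Signal() hands out

    // Hands out the next fence value; values are never reset or reused.
    uint64_t Signal()
    {
        // Wrapping past the maximum would produce 0, which always reads as complete.
        assert(NextToSignal != UINT64_MAX);
        return NextToSignal++;
    }

    // Records GPU progress; completed values only ever move forward.
    void MarkCompleted(uint64_t Value)
    {
        if (Value > LastCompleted)
        {
            LastCompleted = Value;
        }
    }

    // A wait on Value is satisfied once the GPU has reached it.
    bool IsComplete(uint64_t Value) const { return Value <= LastCompleted; }
};
```

Because `IsComplete(0)` is true before any work is submitted, a wait on 0 provides no synchronization at all.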
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2020-07-06 18:58:26 -04:00
if (GAOGlobalDistanceFieldForceMovementUpdate != 0)
{
2020-09-08 17:44:06 -04:00
MovementInPages = FIntVector(GAOGlobalDistanceFieldForceMovementUpdate, GAOGlobalDistanceFieldForceMovementUpdate, GAOGlobalDistanceFieldForceMovementUpdate);
2020-07-06 18:58:26 -04:00
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was that HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
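The fence reuse bug described in the list above can be illustrated with a minimal sketch. FenceModel and its members are hypothetical names, not the D3D12 API; the sketch only assumes that a wait is satisfied once the completed value is greater than or equal to the target, which is why a wait for 0 always passes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a monotonic fence; this is not the D3D12 API.
struct FenceModel
{
    uint64_t Completed = 0;   // last value the simulated GPU has signaled
    uint64_t LastIssued = 0;  // last value handed out to a waiter

    // Hands out a strictly increasing value to wait on (never reset to an earlier value).
    uint64_t IssueNext() { return ++LastIssued; }

    // Simulates the GPU reaching a signal command.
    void Signal(uint64_t Value) { Completed = Value; }

    // A wait is satisfied once the completed value reaches the target.
    bool IsComplete(uint64_t Value) const { return Completed >= Value; }
};
```

With this model, IsComplete(0) is true on any fence, so a wait for 0 never blocks. Issuing increasing values through IssueNext() avoids that, and it also removes any need to reset the fence to a previous value.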
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
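A minimal sketch of the noncontiguous page allocation idea from the ESRAM change above. PageAllocatorSketch and its methods are hypothetical and do not reflect the engine's allocator or any platform API; the sketch only shows physical pages being taken from a free list in any order to back a contiguous run of virtual pages, then returned for reuse on free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page allocator; names are illustrative only.
class PageAllocatorSketch
{
public:
    explicit PageAllocatorSketch(uint32_t PageCount)
    {
        // Push pages in descending order so the lowest index is handed out first.
        for (uint32_t i = PageCount; i > 0; --i)
        {
            FreePages.push_back(i - 1);
        }
    }

    // Fills OutPages with the physical pages backing virtual pages 0..NumPages-1.
    // The physical pages do not have to be adjacent. Returns false if too few pages are free.
    bool Allocate(std::size_t NumPages, std::vector<uint32_t>& OutPages)
    {
        if (NumPages > FreePages.size())
        {
            return false;
        }
        const auto First = FreePages.end() - static_cast<std::ptrdiff_t>(NumPages);
        OutPages.assign(First, FreePages.end());
        FreePages.erase(First, FreePages.end());
        return true;
    }

    // Returns pages to the free list so later allocations can reuse them.
    void Free(const std::vector<uint32_t>& Pages)
    {
        FreePages.insert(FreePages.end(), Pages.begin(), Pages.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<uint32_t> FreePages;
};
```

After a few allocate and free cycles the free list is fragmented, yet an allocation still succeeds as long as enough pages are free in total. That is the property a contiguous-range allocator lacks.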
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
if (CacheType == GDF_MostlyStatic || !GAOGlobalDistanceFieldCacheMostlyStaticSeparately)
{
// Add an update region for each potential axis of camera movement
2020-09-08 17:44:06 -04:00
AddUpdateBoundsForAxis(MovementInPages, ClipmapBounds, ClipmapPageSize, 0, Clipmap.UpdateBounds);
AddUpdateBoundsForAxis(MovementInPages, ClipmapBounds, ClipmapPageSize, 1, Clipmap.UpdateBounds);
AddUpdateBoundsForAxis(MovementInPages, ClipmapBounds, ClipmapPageSize, 2, Clipmap.UpdateBounds);
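The three per-axis calls above add one update region per axis of camera movement. As a rough sketch of what such a helper might compute, assuming a scrolling clipmap where movement along an axis exposes a slab at the leading face: Box and AddSlabForAxis are hypothetical stand-ins, not the engine's FBox or AddUpdateBoundsForAxis.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical axis-aligned box; not the engine's FBox.
struct Box
{
    std::array<float, 3> Min;
    std::array<float, 3> Max;
};

// Appends the slab of ClipmapBounds that is newly exposed by moving along Axis.
// Adds nothing when there is no movement on that axis.
void AddSlabForAxis(const std::array<int, 3>& MovementInPages, const Box& ClipmapBounds,
                    float PageSize, int Axis, std::vector<Box>& OutBounds)
{
    const int Movement = MovementInPages[Axis];
    if (Movement == 0)
    {
        return;
    }

    Box Slab = ClipmapBounds;
    const float Distance = static_cast<float>(std::abs(Movement)) * PageSize;
    if (Movement > 0)
    {
        // Positive movement: the new area sits against the max face.
        Slab.Min[Axis] = std::max(ClipmapBounds.Max[Axis] - Distance, ClipmapBounds.Min[Axis]);
    }
    else
    {
        // Negative movement: the new area sits against the min face.
        Slab.Max[Axis] = std::min(ClipmapBounds.Min[Axis] + Distance, ClipmapBounds.Max[Axis]);
    }
    OutBounds.push_back(Slab);
}
```

Clamping against the opposite face means a movement larger than the clipmap produces a slab covering the whole volume rather than an inverted box.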
2017-01-26 19:20:49 -05:00
}
else
{
// Inherit from parent
2020-07-06 18:58:26 -04:00
Clipmap.UpdateBounds.Append(GlobalDistanceFieldInfo.MostlyStaticClipmaps[ClipmapIndex].UpdateBounds);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
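One way to read the render target fix above, as a sketch only: count up to the last slot that holds a valid format, so a description with NumRenderTargets > 0 but only unknown formats yields zero. PixelFormat and ComputeNumValidRenderTargets are hypothetical stand-ins; the actual engine code and its exact counting rule are not shown in this changelist text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for DXGI_FORMAT; Unknown plays the role of DXGI_UNKNOWN.
enum class PixelFormat : uint8_t { Unknown, RGBA8, RGBA16F };

constexpr std::size_t MaxRenderTargets = 8;

// Returns the index of the last valid format plus one, or 0 if every slot is Unknown.
uint32_t ComputeNumValidRenderTargets(const std::array<PixelFormat, MaxRenderTargets>& Formats,
                                      uint32_t NumRenderTargets)
{
    uint32_t NumValid = 0;
    for (uint32_t i = 0; i < NumRenderTargets && i < MaxRenderTargets; ++i)
    {
        if (Formats[i] != PixelFormat::Unknown)
        {
            NumValid = i + 1;
        }
    }
    return NumValid;
}
```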
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
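To make the waste figure above concrete, here is a generic sketch of why buddy allocators lose memory: each request is rounded up to a power-of-two block. The function names and the threshold check are illustrative assumptions, not the D3D12RHI implementation, and the changelist does not state that this rounding is the only source of the measured waste.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= Size (Size of 0 or 1 yields 1).
uint64_t RoundUpToPowerOfTwo(uint64_t Size)
{
    uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost when a buddy allocator serves Size from a power-of-two block.
uint64_t BuddyWaste(uint64_t Size, uint64_t MinBlockSize)
{
    const uint64_t Block = std::max(RoundUpToPowerOfTwo(Size), MinBlockSize);
    return Block - Size;
}

// Requests above the threshold would bypass the buddy allocator entirely.
bool ShouldUseBuddyAllocator(uint64_t Size, uint64_t MaxBuddyAllocationSize)
{
    return Size <= MaxBuddyAllocationSize;
}
```

For example, a 5KB request lands in an 8KB block and wastes 3KB, while a 4KB request fits its block exactly. Lowering the threshold routes larger requests to a different allocation path.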
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning() is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was that HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
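The fence reuse bug in CL 3231395 above can be illustrated with a small CPU-side model. This is only a sketch: FSimulatedFence and its methods are hypothetical names, not the D3D12RHI or ID3D12Fence API. It assumes a fence counts as complete once its completed value is greater than or equal to the value being waited on, which is why waiting for 0 returns immediately and why values should only ever increase.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of a GPU fence (illustration only, not the D3D12 API).
class FSimulatedFence
{
public:
	// Reserve the next value to signal. Values are never reset or reused.
	std::uint64_t Signal()
	{
		return ++LastSignaled;
	}

	// Called when the simulated GPU reaches a previously signaled value.
	void Complete(std::uint64_t Value)
	{
		if (Value > Completed)
		{
			Completed = Value;
		}
	}

	// A wait on Value is satisfied once the completed value has reached it.
	bool IsComplete(std::uint64_t Value) const
	{
		return Completed >= Value;
	}

private:
	std::uint64_t LastSignaled = 0;
	std::uint64_t Completed = 0;
};
```

With this model a wait on value 0 is always satisfied, so a caller that waits on 0 never actually waits. Handing out strictly increasing values avoids that without ever resetting the fence.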
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize setting to the InvestigateTexture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
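The idea behind the non-contiguous page allocator in change 3271594 can be sketched as a free list of physical pages, where each allocation records which physical page backs each consecutive virtual page. This is a minimal illustration under assumed names (FPagePool, Allocate, Free are hypothetical); it does not use the platform's ESRAM API and omits the deferred deallocation mentioned in the note.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page pool: hands out physical pages in any order and records
// the physical page that backs each consecutive virtual page of an allocation.
class FPagePool
{
public:
	explicit FPagePool(std::uint32_t NumPages)
	{
		// Push in reverse so that page 0 is handed out first.
		for (std::uint32_t Index = NumPages; Index > 0; --Index)
		{
			FreePages.push_back(Index - 1);
		}
	}

	// Fills OutMapping so that OutMapping[i] is the physical page behind virtual page i.
	// Returns false, leaving OutMapping empty, if not enough pages are free.
	bool Allocate(std::size_t NumPages, std::vector<std::uint32_t>& OutMapping)
	{
		OutMapping.clear();
		if (NumPages > FreePages.size())
		{
			return false;
		}
		for (std::size_t Index = 0; Index < NumPages; ++Index)
		{
			OutMapping.push_back(FreePages.back());
			FreePages.pop_back();
		}
		return true;
	}

	// Returns the pages of a destroyed resource to the pool for reuse.
	void Free(const std::vector<std::uint32_t>& Mapping)
	{
		FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
	}

	std::size_t NumFree() const
	{
		return FreePages.size();
	}

private:
	std::vector<std::uint32_t> FreePages;
};
```

Because the mapping is stored per virtual page, a later allocation can be satisfied from pages that are not adjacent in physical memory, as long as enough pages are free in total.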
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
}
2020-07-06 18:58:26 -04:00
}
2017-01-26 19:20:49 -05:00
2020-07-06 18:58:26 -04:00
// Only use partial updates with small numbers of primitive modifications
2020-09-08 17:44:06 -04:00
bool bUsePartialUpdatesForUpdateBounds = bUsePartialUpdates && CulledPrimitiveModifiedBounds.Num() < 1024;
2017-01-26 19:20:49 -05:00
2020-07-06 18:58:26 -04:00
if (!bUsePartialUpdatesForUpdateBounds)
{
    Clipmap.UpdateBounds.Reset();
    Clipmap.UpdateBounds.Add(FClipmapUpdateBounds(ClipmapBounds.GetCenter(), ClipmapBounds.GetExtent(), false));
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
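The waste described in this change can be illustrated with a minimal sketch, assuming a textbook buddy allocator that rounds every request up to a power-of-two block. The names BuddyBlockSize, BuddyWaste, and MinBlockSize are hypothetical and are not taken from the D3D12 RHI; the sketch only shows why larger requests can leave a lot of unused space in their block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the block size a power-of-two buddy allocator would
// hand out for a request. MinBlockSize is an assumed smallest block size.
inline uint64_t BuddyBlockSize(uint64_t RequestBytes, uint64_t MinBlockSize)
{
    // The smallest block must itself be a non-zero power of two.
    assert(MinBlockSize > 0 && (MinBlockSize & (MinBlockSize - 1)) == 0);

    uint64_t Block = MinBlockSize;
    while (Block < RequestBytes)
    {
        Block <<= 1; // Each level of the buddy tree doubles the block size.
    }
    return Block;
}

// Bytes left unused inside the block that serves the request.
inline uint64_t BuddyWaste(uint64_t RequestBytes, uint64_t MinBlockSize)
{
    return BuddyBlockSize(RequestBytes, MinBlockSize) - RequestBytes;
}
```

Under these assumptions a 5000-byte request lands in an 8192-byte block and wastes 3192 bytes, while a 300-byte request wastes only 212, so capping the size served by the buddy allocator bounds the absolute waste per allocation.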
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
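One way to clip an update region against another is to subtract one axis-aligned box from the other and keep the non-overlapping remainder. The sketch below assumes that approach; Box3, Overlaps, Subtract, and Volume are hypothetical names, and the engine's actual clipping code may work differently.

```cpp
#include <vector>

// Hypothetical axis-aligned box; Min/Max hold x, y, z.
struct Box3
{
    float Min[3];
    float Max[3];
};

inline bool Overlaps(const Box3& A, const Box3& B)
{
    for (int Axis = 0; Axis < 3; ++Axis)
    {
        if (A.Min[Axis] >= B.Max[Axis] || B.Min[Axis] >= A.Max[Axis])
        {
            return false;
        }
    }
    return true;
}

// Returns the parts of Region not covered by Other as up to six
// non-overlapping boxes. Region is taken by value and shrunk axis by axis.
inline std::vector<Box3> Subtract(Box3 Region, const Box3& Other)
{
    std::vector<Box3> Out;
    if (!Overlaps(Region, Other))
    {
        Out.push_back(Region); // Nothing to clip away.
        return Out;
    }
    for (int Axis = 0; Axis < 3; ++Axis)
    {
        if (Other.Min[Axis] > Region.Min[Axis])
        {
            Box3 Slab = Region; // Slab below Other on this axis.
            Slab.Max[Axis] = Other.Min[Axis];
            Out.push_back(Slab);
            Region.Min[Axis] = Other.Min[Axis];
        }
        if (Other.Max[Axis] < Region.Max[Axis])
        {
            Box3 Slab = Region; // Slab above Other on this axis.
            Slab.Min[Axis] = Other.Max[Axis];
            Out.push_back(Slab);
            Region.Max[Axis] = Other.Max[Axis];
        }
    }
    // What remains of Region is the overlap with Other, which is dropped.
    return Out;
}

inline float Volume(const Box3& B)
{
    return (B.Max[0] - B.Min[0]) * (B.Max[1] - B.Min[1]) * (B.Max[2] - B.Min[2]);
}
```

Clipping each new region against the regions already queued means a voxel covered by two overlapping updates is only recomputed once, which is the redundancy this change aims to remove.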
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
// Check if the clipmap intersects with a pending update region
bool bHasPendingStreaming = false;
for (const FBox& HeightfieldBox : PendingStreamingHeightfieldBoxes)
{
    if (ClipmapBounds.Intersect(HeightfieldBox))
    {
        bHasPendingStreaming = true;
        break;
    }
}
// If any height field has pending streaming regions, postpone the full update.
if (bHasPendingStreaming)
{
    // Mark a pending update for this clipmap. It is processed once all pending texture streaming that affects it has completed.
    View.ViewState->DeferredGlobalDistanceFieldUpdates[CacheType].AddUnique(ClipmapIndex);
}
else if (View.ViewState->DeferredGlobalDistanceFieldUpdates[CacheType].Remove(ClipmapIndex) > 0)
{
2020-09-08 17:44:06 -04:00
    // Push full update
    Clipmap.UpdateBounds.Reset();
    Clipmap.UpdateBounds.Add(FClipmapUpdateBounds(ClipmapBounds.GetCenter(), ClipmapBounds.GetExtent(), false));
}
2021-05-28 11:26:14 -04:00
ClipmapViewState.Cache[CacheType].PrimitiveModifiedBounds.Empty(DistanceField::MinPrimitiveModifiedBoundsAllocation);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3219450)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3148067 on 2016/10/01 by Daniel.Wright
Support for ReflectionEnvironment and light type show flags with ForwardShading
Change 3149085 on 2016/10/03 by Daniel.Wright
Support for ReflectionEnvironment show flag in base pass reflections without any shader overhead
Change 3162206 on 2016/10/13 by Chris.Bunner
Merging Dev-MaterialLayers to Dev-Rendering, CL 3161593:
Material expressions; Trig, fast-trig, saturate, round, truncate, pre-skinned normal.
Added CustomEyeTangent to material attributes.
Resolved some hard-coded attribute typing and other minor fixes.
Change 3186067 on 2016/11/03 by Daniel.Wright
Updated Stationary primitive tooltip to indicate that it allows the primitive to be changed, but not moved
Change 3186069 on 2016/11/03 by Daniel.Wright
Using a weighted geometric mean to combine multiple Distance Field Indirect Shadows, greatly reduces over-occlusion when overlap is high
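A rough sketch of why a weighted geometric mean reduces over-occlusion compared with multiplying the shadow terms together. FShadowSample, its fields, and the choice of weights are assumptions for illustration only; the change description does not say how the weights are derived.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical per-occluder shadow term: Visibility in (0, 1], Weight >= 0.
struct FShadowSample
{
    float Visibility;
    float Weight;
};

// Weighted geometric mean: exp(sum(w_i * ln(v_i)) / sum(w_i)).
inline float CombineWeightedGeometricMean(const std::vector<FShadowSample>& Samples)
{
    float WeightedLogSum = 0.0f;
    float TotalWeight = 0.0f;
    for (const FShadowSample& Sample : Samples)
    {
        if (Sample.Weight <= 0.0f)
        {
            continue; // Ignore samples that contribute nothing.
        }
        // Clamp away from zero so the logarithm stays finite.
        const float Visibility = std::max(Sample.Visibility, 1e-4f);
        WeightedLogSum += Sample.Weight * std::log(Visibility);
        TotalWeight += Sample.Weight;
    }
    if (TotalWeight <= 0.0f)
    {
        return 1.0f; // No occluders: fully visible.
    }
    return std::exp(WeightedLogSum / TotalWeight);
}
```

With two equally weighted samples of 0.25 and 1.0, a plain product gives 0.25, while the geometric mean gives 0.5, so heavily overlapping shadows darken the result less.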
Change 3186084 on 2016/11/03 by Mark.Satterthwaite
Duplicate 3172511:
Don't set Metal resource option fields on texture descriptors when running on an OS that doesn't support them.
#jira UE-37481
Change 3186089 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3169764:
Fixed automatic conversion of G8_sRGB into RGBA8_sRGB required for Mac Metal, which fixes FORT-27627.
#jira FORT-27627
Change 3186113 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183807:
Change the way we access the Metal viewport's backbuffer, to reduce possible causes of FORT-31649:
- Added console variable "rhi.Metal.SupportsIntermediateBackBuffer" to control whether to use an extra render-target so we can support screenshots & movie capture, or render directly to the back-buffer to save memory & GPU performance. Still defaults to ON for Mac & OFF for iOS/tvOS.
- Change the way we handle updates to the back-buffer size to ensure that the different threads access their intended version.
#jira FORT-31649
Change 3186116 on 2016/11/03 by Mark.Satterthwaite
Duplicate CL #3183823:
Record Metal resource & state objects used in a command-buffer when rhi.Metal.RuntimeDebugLevel is set to 3 or higher. The object labels, types & descriptions will be printed on failure - if the object is deleted prior to this then we have a lifetime error and it will crash at this point and can be debugged further using our -metalretainrefs command-line option or Xcode's zombie-objects.
Used to verify that FORT-31649 is not a simple resource lifetime error and thereby speed up Apple/vendor investigations.
#jira FORT-31649
Change 3186818 on 2016/11/04 by Chris.Bunner
PR #2907 Export UMaterialExpressionNoise (contributed by kayosiii).
Change 3186979 on 2016/11/04 by Rolando.Caloca
DR - Misc minor cleanup
Change 3187169 on 2016/11/04 by Uriel.Doyon
Incremental insertion of level data between PostLoad and AddToWorld
Change 3187205 on 2016/11/04 by Mark.Satterthwaite
Compile fixes for iOS.
Change 3187389 on 2016/11/04 by Uriel.Doyon
Fix for possible stall when loading hidden level
Change 3187598 on 2016/11/04 by Michael.Trepka
MetalViewport compile fix
Change 3187678 on 2016/11/04 by Uriel.Doyon
Fix for landscape grass textures not being streamed in correctly.
Change 3187731 on 2016/11/04 by Rolando.Caloca
DR - Start making type safe some cross compiler enums
Change 3187824 on 2016/11/04 by Rolando.Caloca
DR - clang compile fix
Change 3187953 on 2016/11/04 by Rolando.Caloca
DR - vk - Mac compile fix
Change 3188696 on 2016/11/07 by Mark.Satterthwaite
Another iOS compile fix for new MetalViewport validation code.
Change 3188906 on 2016/11/07 by Rolando.Caloca
DR - Show permutation of LUTBlender
Change 3189094 on 2016/11/07 by Chris.Bunner
Fix RemoveAAJitter from projection matrix.
#jira UE-37701, UE-38003
Change 3189134 on 2016/11/07 by Daniel.Wright
Fix for CreateRenderTarget2D called in construction script during cooking
Change 3189145 on 2016/11/07 by Chris.Bunner
Follow-up to CL 3186818, export UMaterialExpressionVectorNoise.
Change 3189239 on 2016/11/07 by Daniel.Wright
Added show flag for Contact Shadows, disabled in planar reflections
Change 3189252 on 2016/11/07 by Daniel.Wright
Support for Reflection Capture intensity with simple reflections, which are the default with Forward Shading
Change 3189406 on 2016/11/07 by Mark.Satterthwaite
Really fix the last of the iOS compile errors from changes to the MetalViewport code.
Change 3190854 on 2016/11/08 by Ben.Woodhouse
XB1: Fix memory corruption with RHICreateVertexBuffer and RHICreateIndexBuffer when using initial data (Procedural Mesh Component crash)
#jira UE-34264
#fyi james.golding
#fyi keith.judge
Change 3190962 on 2016/11/08 by Olaf.Piesche
Unshelved from pending changelist '3176615' - Gil's fix for race condition with particle vertex factory reuse across different passes; potential to fix a number of issues
Change 3191959 on 2016/11/09 by Uriel.Doyon
Removed some static primitives from the dynamic primitive handler for texture streaming.
Change 3193122 on 2016/11/10 by Chris.Bunner
Always update non-preview material resources for use in code preview.
#jira UE-38223
Change 3193190 on 2016/11/10 by Gil.Gribb
UE4 - Fixed rare bug with shadow groups rendering things that have not been setup to render this frame.
#jira UE-36379
Change 3193523 on 2016/11/10 by Uriel.Doyon
Fixed incorrect section bounds used for texture streaming.
Change 3193962 on 2016/11/10 by Uriel.Doyon
Added defrag of dynamic bounds used for the texture streaming. Allows unused bounds to be removed over time.
Change 3193974 on 2016/11/10 by Uriel.Doyon
New "Required Texture Resolution" view mode, showing the ratio between the currently streamed texture resolution and the resolution wanted by the GPU.
Change 3194109 on 2016/11/10 by Uriel.Doyon
Another patch on material bounds used for texture streaming.
Change 3194665 on 2016/11/11 by Chris.Bunner
Duplicated behavior for inherited velocity scaling to vert/surface spawned particles.
Change 3194734 on 2016/11/11 by Rolando.Caloca
DR - vk - Simplified some texture casting
Change 3194867 on 2016/11/11 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3195176 on 2016/11/11 by Chris.Bunner
Fixed incorrectly updated NVAPI error.
Change 3195425 on 2016/11/11 by Uriel.Doyon
Fixed possible invalid level reference in the texture streamer
Change 3196512 on 2016/11/14 by Gil.Gribb
Merging //UE4/Dev-Main@3196156 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3196750 on 2016/11/14 by Marcus.Wassmer
Fix ordering problem with GPU cache transitions
Change 3196815 on 2016/11/14 by Daniel.Wright
Suppressed 'Instanced stereo rendering is not supported' warning showing up in CIS
Change 3196818 on 2016/11/14 by Daniel.Wright
Fixed FIndirectLightingCache::UpdateCachePrimitivesInternal churning through a bunch of temporary memory
Change 3196819 on 2016/11/14 by Daniel.Wright
Volume lighting samples are allowed outside of the importance volume if their influence affects the volume. Fixes black indirect lighting on movable components in maps with small importance volumes.
Volume lighting samples placed on surfaces use a radius that covers the layer height spacing, which prevents an uncovered region between layers
Change 3197243 on 2016/11/14 by Uriel.Doyon
Async Task For Updating static component LastRender time
#jira UE-24268
Change 3197359 on 2016/11/14 by Daniel.Wright
Added Inscattering Texture controls to ExponentialHeightFog
* When InscatteringColorCubemap is specified, directional light inscattering is disabled
* Lerps between 1x1 mip at NonDirectionalInscatteringColorDistance to mip 0 at FullyDirectionalInscatteringColorDistance
* Added FogCutoffDistance, so artists can prevent fog on skyboxes (requires fog to be setup matching the fog that was rendered into the sky texture so that distant mountains match)
* Fog shader permutations based on what feature is enabled
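The mip selection in the list above can be sketched as follows. This assumes NonDirectionalInscatteringColorDistance is smaller than FullyDirectionalInscatteringColorDistance and that the blend is linear in distance; the function name and parameter order are hypothetical, and the actual shader may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: picks a cubemap mip level from the distance to the
// shaded point. NumMips counts the full chain, so NumMips - 1 is the 1x1 mip.
inline float ComputeInscatteringMip(float Distance,
                                    float NonDirectionalDistance,
                                    float FullyDirectionalDistance,
                                    int NumMips)
{
    assert(NumMips >= 1);
    assert(FullyDirectionalDistance > NonDirectionalDistance);
    const float Range = FullyDirectionalDistance - NonDirectionalDistance;
    // 0 at the non-directional distance, 1 at the fully directional distance.
    const float Alpha =
        std::min(std::max((Distance - NonDirectionalDistance) / Range, 0.0f), 1.0f);
    const float LastMip = static_cast<float>(NumMips - 1);
    // Blend from the 1x1 mip (no directionality) down to mip 0 (full detail).
    return LastMip * (1.0f - Alpha);
}
```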
Change 3198419 on 2016/11/15 by Chris.Bunner
PS4 HDR: Runtime toggle (backbuffer recreation on resize matching), UI composition. Matches PC behavior and controls.
HDR: Generalized buffer formats, cvar consistency pass, LUT for UI composition, refactoring common functions.
Exposed RHICreateTargetableShaderResource3D.
Moved some (translucent) volume rendering helpers to allow access in Slate.
Change 3198822 on 2016/11/15 by Daniel.Wright
Mac compile fix
Change 3199509 on 2016/11/15 by Uriel.Doyon
Added support for viewmode param asset name (and not just param value).
Used to investigate texture streamer behavior.
Change 3199578 on 2016/11/15 by Rolando.Caloca
DR - Add some shader resource tables to SCW when running with -directcompile
Change 3199698 on 2016/11/15 by Rolando.Caloca
DR - vk - Refactor shader & descriptor bindings
Change 3199712 on 2016/11/15 by Rolando.Caloca
DR - vk - r.Vulkan.StripGlsl to always strip glsl at runtime to save memory per shader
Change 3199717 on 2016/11/15 by Rolando.Caloca
DR - vk - Show hitching PSO info again
Change 3199750 on 2016/11/15 by Rolando.Caloca
DR - SCW clang compile fixes
Change 3200353 on 2016/11/16 by Rolando.Caloca
DR - vk - Mac fix
Change 3200358 on 2016/11/16 by Chris.Bunner
Only allow UI composition on platforms where we currently use it.
Change 3200823 on 2016/11/16 by Chris.Bunner
Remove expression key attribute ID when not translating an attribute output to allow intended expression sharing.
#jira UE-38699
Change 3200947 on 2016/11/16 by Mark.Satterthwaite
Fix UE-38695 by not trying to resize the viewport on the wrong thread.
#jira UE-38695
Change 3201069 on 2016/11/16 by Daniel.Wright
Fog inscattering texture limited to SM4 and above, fixes ES2 compile errors
Change 3201346 on 2016/11/16 by Brian.Karis
Temporal AA fix for correct edge gradients.
Filtering now combined with importance sampling.
Enabled Catmull-Rom resolve filter. Results are now slightly sharper.
Fixed antighosting. Will still require a dilation to be perfect.
Optimized bicubic filtering to 5 taps instead of 9.
Cleaned out unused code.
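For reference, the textbook 1D Catmull-Rom weights that a bicubic filter of this kind is built from are sketched below. This is the standard spline only; it does not reproduce the 5-tap optimization mentioned above, and the function name is hypothetical.

```cpp
#include <array>
#include <cassert>

// Textbook Catmull-Rom weights for four consecutive samples, where T in
// [0, 1] is the fractional position between the two middle samples.
inline std::array<float, 4> CatmullRomWeights(float T)
{
    assert(T >= 0.0f && T <= 1.0f);
    const float T2 = T * T;
    const float T3 = T2 * T;
    return {{
        -0.5f * T3 + T2 - 0.5f * T,        // sample at offset -1
         1.5f * T3 - 2.5f * T2 + 1.0f,     // sample at offset  0
        -1.5f * T3 + 2.0f * T2 + 0.5f * T, // sample at offset +1
         0.5f * T3 - 0.5f * T2             // sample at offset +2
    }};
}
```

The weights always sum to 1, and the two outer weights are negative for T strictly between 0 and 1, which is why this filter resolves slightly sharper than a purely positive kernel.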
Change 3201369 on 2016/11/16 by Brian.Karis
Bicubic texture sample
Change 3201522 on 2016/11/16 by Rolando.Caloca
DR - vk - Fix static analysis issues
Change 3201878 on 2016/11/17 by Chris.Bunner
Temporarily disable Nvapi HDR error logging.
#jira UE-38529
Change 3202108 on 2016/11/17 by Simon.Tovey
Assets with easy repro for flickering particles bug
Change 3202181 on 2016/11/17 by Rolando.Caloca
DR - vk - CIS android fix
Change 3202325 on 2016/11/17 by Ben.Woodhouse
Integrate 4.14.1 fix from 14 //UE4/Release-4.14 (@3201850)
Fix CreateVertexbuffer and CreateIndexBuffer memory corruption (Procedural Mesh Component crash)
#jira UE-34264
Change 3204394 on 2016/11/18 by Guillaume.Abadie
PR #2808: AlphaComposite Fog Opacity fix (Contributed by moritz-wundke)
#br Ben.Woodhouse
Change 3204428 on 2016/11/18 by Guillaume.Abadie
Fixes a couple of issues in decals:
* Crash in FDecalDrawingPolicyFactory::DrawMesh()
* ActorPosition material expression
* PixelNormalWS material expression
* Missing renaming from DEFERRED_DECAL to DECAL_PRIMITIVE
#jira UE-38327, UE-38158, UE-37818, UE-37350
Change 3204429 on 2016/11/18 by Uriel.Doyon
Darker default undefined accuracy.
Reenabled the texture streaming build in the build all.
Change 3204458 on 2016/11/18 by Chris.Bunner
Shader truncation warnings fix.
Change 3204459 on 2016/11/18 by Chris.Bunner
Engine 'Passthrough' material function fix. V4 is now actually a V4.
Change 3204460 on 2016/11/18 by Chris.Bunner
Correctly handle some known Nvapi warnings.
#jira UE-38529
Change 3204653 on 2016/11/18 by Marc.Olano
Helper functions for tiled textures
Checking in for Ryan Brucks
Change 3204863 on 2016/11/18 by Arne.Schober
DR - Replaced ENQUEUE_UNIQUE_RENDER_COMMAND with a Debuggable template Implementation
Change 3204939 on 2016/11/18 by Arne.Schober
DR - Make clang happy
Change 3204968 on 2016/11/18 by Arne.Schober
DR - UE-38494 - Fixed SpeedTree Wind crash, when force deleting the Asset.
Change 3206293 on 2016/11/21 by Uriel.Doyon
New member bHasStreamingUpdatePending in UTexture2D to delay update of global distance fields.
Set to true when the streamer can possibly load a mip in the near future.
#jira UE-37787
Change 3206551 on 2016/11/21 by Chris.Bunner
Added material update context when forcing all shaders to recompile.
#jira UE-38481
Change 3206644 on 2016/11/21 by Benjamin.Hyder
Updating Planar Reflection example in TM-Shadermodels.
Change 3206899 on 2016/11/21 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3206900 on 2016/11/21 by Rolando.Caloca
DR - Added missing strings for shader formats
Change 3206983 on 2016/11/21 by Rolando.Caloca
DR - vk - Support for SV_Coverage
Change 3207237 on 2016/11/22 by Simon.Tovey
Exporting particle module base and a couple of child classes as it's commonly requested.
#test compiles
Change 3207241 on 2016/11/22 by Gil.Gribb
Merging //UE4/Dev-Main@3206998 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3207520 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //Fortnite/Main@3206301
Fixed GPU hang in Zone Map view. Was an issue with RenderThread using the device context without appropriate RHIThread flushes.
#jira FORT-31616
#code_review keith.judge
Change 3207541 on 2016/11/22 by Ben.Woodhouse
Cherry picked from //fortnite/Main@3207422
* Fix UpdateTexture3D to create a staging texture of the region to update rather than the whole texture. This prevents distance fields crashing during update (allocating 18GB per frame in some cases)
* Put UpdateTexture2D DMA support onto a cvar, disabled by default (corruption issues reported by licensees, plus not sure it's actually faster - could be slower due to reduced bandwidth)
* Fix UpdateTexture2D to only create a staging texture of the region to update, saving memory
#jira UE-38609
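To illustrate why sizing the staging texture to the update region matters, here is a rough size calculation. The struct and function names are hypothetical, and the sketch assumes a tightly packed, uncompressed format with no row or slice padding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical extent type for this sketch; not an engine type.
struct FRegionExtentSketch
{
    uint32_t Width;
    uint32_t Height;
    uint32_t Depth;
};

// Bytes needed for a tightly packed staging copy of the given extent.
inline uint64_t ComputeStagingBytes(const FRegionExtentSketch& Extent,
                                    uint32_t BytesPerTexel)
{
    assert(BytesPerTexel > 0);
    return static_cast<uint64_t>(Extent.Width) * Extent.Height * Extent.Depth *
           BytesPerTexel;
}
```

Under these assumptions, staging a whole 512x512x512 volume at 2 bytes per texel takes 256 MiB per update, while a 32x32x32 region of the same texture takes 64 KiB.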
Change 3207654 on 2016/11/22 by Chris.Bunner
Don't flag 16-bit PNG/JPG textures as sRGB on import.
#jira UE-30279
Change 3208434 on 2016/11/22 by Rolando.Caloca
DR - vk - UAV transitions
Change 3208490 on 2016/11/22 by Chris.Bunner
Break material code sharing when we detect an unresolvable loop.
By default change IsResultMA loop detection to stop on functions as we can determine type definitively.
Unified IsResultMA detection across switch nodes.
Change 3208860 on 2016/11/23 by Rolando.Caloca
DR - vk - Fix some format issues
Change 3209265 on 2016/11/23 by Arne.Schober
DR - originally unshelved from 3153924 - Made Depth and Velocity Rendering Passes use the PSO-only RHI interface.
We are now passing down two structs that collect all the necessary information for the drawing policies to construct a PSO object.
One during construction of the Policy, which contains information about the CullMode, FillMode and PrimType.
And another during rendering that passes information like BlendState and DepthStencilState down to the low-level renderer into SetSharedState.
Performance of the static drawlist is slightly slower (less than 0.1ms on Consoles) due to some additional branches and copies. The branches in the FDrawingPolicyRenderState will go away as soon as everything is converted to use the PSO interface.
Performance of the GPU is slightly better due to fewer context rolls (mainly CullMode sorts in differently now)
Change 3209305 on 2016/11/23 by Guillaume.Abadie
Fix contact shadow's assumption on object thickness
Change 3209334 on 2016/11/23 by Brian.Karis
Fixed TAA handling of alpha. Switched the meaning of AA_ALPHA to make sense.
Change 3209903 on 2016/11/24 by Guillaume.Abadie
Cherry picks alpha through post processing changelists 3201959, 3204143 and 3209883 from //UE4/Private-Partner-NREAL
Change 3209973 on 2016/11/24 by Ben.Woodhouse
Fix D3D11 and 12 static analysis warnings reported by Rob Troughton of Coconut Lizard (http://coconutlizard.co.uk/blog/ue4/pvs-studio-part5/)
Change 3210023 on 2016/11/24 by Uriel.Doyon
Fixed an issue with DropDetail when FixedFrameRate was set to a value smaller than MinDesiredFrameRate.
#jira UE-37210
Change 3210026 on 2016/11/24 by Ben.Woodhouse
Disable renderthread hang detection if a debugger is present, so we can debug the renderthread without crashing
Change 3210049 on 2016/11/24 by Ben.Woodhouse
Fix mac build
Change 3210071 on 2016/11/24 by Uriel.Doyon
Fixed an issue with masked materials and shader complexity viewmode when DBuffer Decals are enabled.
#jira UE-37542
Change 3210374 on 2016/11/25 by Ben.Woodhouse
* Fix issues with fast cleared dbuffer targets not being resolved when no decals are in the scene. This caused graphical corruption on XB1 and ensure failures on PS4 (with RHIThread disabled)
* Move Decal rendertarget manager function implementations out of the header.
#jira UE-38879
Change 3210390 on 2016/11/25 by Uriel.Doyon
Fixed cubemap resourcesize not taking into account mipgen settings
#jira UE-37045
Change 3210407 on 2016/11/25 by Uriel.Doyon
"resavepackages" commandlet now supports -buildtexturestreaming that rebuilds the map texture streaming data.
That can be used in combination with -buildlighting.
Change 3210563 on 2016/11/27 by Rolando.Caloca
DR - vk - Integrate cached memory fixes and PF_D24 format fix
#jira UE-39025
PR #2974
Change 3210564 on 2016/11/27 by Rolando.Caloca
DR - Fix for GL linker
PR #2975
#jira UE-39029
Change 3210592 on 2016/11/27 by Rolando.Caloca
DR - vk - SM5 fixes
Change 3210597 on 2016/11/27 by Rolando.Caloca
DR - vk - Prep for staging UB copies to GPU memory
Change 3210600 on 2016/11/27 by Rolando.Caloca
DR - vk - Extract generic range code
Change 3210613 on 2016/11/27 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOnDispatch
Change 3211054 on 2016/11/28 by Rolando.Caloca
DR - vk - Missing reference
Change 3211330 on 2016/11/28 by Chris.Bunner
Shader compile error for max texture coordinate count on skinned meshes.
Change 3211384 on 2016/11/28 by Arne.Schober
DR - Enforce move on EnqueueRenderCommand Lambda
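One common way to enforce a move on an enqueued lambda is a forwarding reference combined with a static_assert, sketched below. The queue class is a hypothetical stand-in built on the standard library; it is not the engine's EnqueueRenderCommand implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a render command queue.
class FCommandQueueSketch
{
public:
    template <typename LambdaType>
    void Enqueue(LambdaType&& Lambda)
    {
        // A forwarding reference deduces an lvalue-reference type for named
        // lambdas, so this rejects callers that would silently copy captures.
        static_assert(!std::is_lvalue_reference<LambdaType>::value,
                      "Pass the lambda as a temporary or via std::move");
        Commands.emplace_back(std::move(Lambda));
    }

    // Runs every queued command in submission order, then empties the queue.
    void Flush()
    {
        for (auto& Command : Commands)
        {
            assert(static_cast<bool>(Command));
            Command();
        }
        Commands.clear();
    }

    std::size_t Num() const { return Commands.size(); }

private:
    std::vector<std::function<void()>> Commands;
};
```

With this shape, `Queue.Enqueue(Named)` fails to compile for a named lambda, while `Queue.Enqueue(std::move(Named))` and inline temporaries are accepted.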
Change 3211431 on 2016/11/28 by Gil.Gribb
Merging //UE4/Dev-Main@3211016 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3211738 on 2016/11/28 by Gil.Gribb
IWYU fixes after merge
Change 3212231 on 2016/11/28 by Richard.Wallis
Fix build errors
Change 3212253 on 2016/11/28 by Richard.Wallis
Remove MacGraphicsSwitching plugin.
#jira UE-37640
Change 3212310 on 2016/11/28 by Rolando.Caloca
DR - vk - Update glslang to 1.0.33.0
Change 3212446 on 2016/11/28 by Guillaume.Abadie
Implements PreviousFrameSwitch material expression
Change 3212594 on 2016/11/28 by Arne.Schober
DR - Fix missing include
Change 3212681 on 2016/11/29 by Rolando.Caloca
DR - vk - Auto flush for compute shader
Change 3213000 on 2016/11/29 by Gil.Gribb
temp fix for PF_MAX
Change 3213161 on 2016/11/29 by Ben.Woodhouse
Integrate latest D3D12 changes from //depot/Partners/Microsoft/UE4-DX12/...@3211714
Using:
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Runtime/D3D12RHI/...@3211714 //UE4/Dev-Rendering/Engine/Source/Runtime/D3D12RHI/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/ThirdParty/Windows/DirectX/...@3211714 //UE4/Dev-Rendering/Engine/Source/ThirdParty/Windows/DirectX/...
- p4 integrate //depot/Partners/Microsoft/UE4-DX12/Engine/Source/Programs/UnrealBuildTool/...@3211714 //UE4/Dev-Rendering/Engine/Source/Programs/UnrealBuildTool/...
Changes from UE4-DX12:
*** CL 3183818 ***
Update D3D12 RHI to 4.14:
- Merged changes from Epic up until 10/20/16
- Fixed an issue where command allocators were resetting too early. I changed to aggressive command list batching by default: now that more SubmitCommandListHint calls exist in the upper engine, we don't need to worry about starving the GPU. Fewer ExecuteCommandLists calls means better performance and fewer Signals() so this change provides a GPU perf win.
I had to fix an issue with aggressive batching where we would sometimes hold on to a command list long enough (in the pending list) but hadn't executed it yet. The command allocator was being put back in the queue of allocators during ReleaseCommandAllocator() without a syncpoint set and was thus being reset too early. I added a simple counter to the command allocator so it could track how many command lists were using it. It doesn't need to be thread safe since only one thread uses a command allocator at a time.
I also added some stats around the # command lists and # command allocators since it would be possible to leak command allocators now if its pending command list count isn't decremented correctly. In that case we'd keep creating new command allocators and eventually run out of memory.
-Remove clear during allocate in the FD3D12FastConstantAllocator and FD3D12FastAllocator. The supplied resource locations are assumed to be new and thus don't need to be cleared.
-Cleanup D3D12RHI stats. There were some unused stats as well as some missing ones.
-Mark shader resource table uniform buffers as dirty only when the shader changes. Cleanup SetComputeShader calls and Dispatch calls to not set/unset the CS for each Dispatch.
-Remove unused Check SRV resolved code that Epic added to the D3D11 RHI and was brought over. We don't need it and we won't use this.
-Remove "always on" cycle counters for high frequency RHI methods like RHISetShaderTexture. These should use the engine's stat macros as they are removed on TEST + SHIPPING builds. On Xbox a significant amount of CPU time is spent in things like QueryPerformanceCounter even when STATS aren't enabled. Currently 1% of an entire capture on XBOX is spent inside this call.
I improved and cleaned up high frequency call stacks like:
- RHISetShaderTexture
- RHISetShaderResourceViewParameter
- RHISetShaderParameter
- RHISetUAVParameter
In general I moved to use templated functions, removed unused parameters, unnecessary copies, etc.
-Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Resources should be associated with the rendering thread's frame that it's currently recording command lists for and they shouldn't be cleaned up until those command lists have been translated to D3D12 command lists on the RHI thread AND completed executing on the GPU. This was confirmed to resolve an issue where CBV resources were being released too early.
This work involved a couple changes:
1) Move the "frame" fence to be incremented on the rendering thread (during RHIAdvanceFrameForGetViewportBackBuffer()) so that resources that are deleted from the rendering thread are associated with the correct frame count
2) Queue up a command from the rendering thread to signal the "frame" fence. It needs to be queued to ensure that it's signaled at the correct time on the RHI thread (after that frame's command lists have been executed).
-Disable GRHIRequiresEarlyBackBufferRenderTarget. Metal/Vulkan/Xbox11.x already do this. This is used by the Slate renderer during BeginRenderFrame and avoids a SetRenderTargets call.
-Enable GRHISupportsMSAADepthSampleAccess (used in the Editor). This was enabled for D3D11 on SM5, but not for D3D12.
-Delay load D3D12.dll and add root signature 1.1 support.
-Add explicit flush calls to improve resource barrier batching instead of implicit flushes inside FConditionalScopeResourceBarrier and FScopeResourceBarrier. Also update those classes with const members.
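The pending command list counter described for the command allocator fix in CL 3183818 could look roughly like the sketch below. The class and method names are hypothetical, and the sketch leaves out the GPU sync point that the real allocator also has to honor before a reset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not the D3D12 RHI class.
class FCommandAllocatorSketch
{
public:
    // Called when a command list recorded with this allocator joins the pending list.
    void IncrementPendingCommandLists() { ++PendingCommandListCount; }

    // Called once that command list has actually been executed.
    void DecrementPendingCommandLists()
    {
        // A missed decrement would leak the allocator, so catch underflow early.
        assert(PendingCommandListCount > 0);
        --PendingCommandListCount;
    }

    // The allocator may go back to the reuse queue only when nothing pending uses it.
    bool HasPendingCommandLists() const { return PendingCommandListCount != 0; }

private:
    // Plain integer: per the changelist, only one thread uses an allocator at a time.
    uint32_t PendingCommandListCount = 0;
};
```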
*** CL 3183824 ***
Fix the D3D12 RHI after integrating UE 4.14 updates:
- Fixed a bug where we would try to get the PSO of a nullptr in SetPipelineState if we needed to reset the current PSO on the cmd list.
- Fixed a spelling error
- Removed the need for bForceState, we use dirty bits now
*** CL 3183830 ***
- GetDebugFlags RHI extension, needed by XB1 movie player.
- Only query memory info if stats are enabled
- Add support for the engine's new RHISubmitCommandsAndFlushGPU function
- Update CommitPendingPipelineState to be Graphics/Compute specific and avoid the need for a IsCompute parameter.
*** CL 3183837 ***
Made PipelineState caches contain pointers to FD3D12PipelineState objects to avoid issues with using pointers to map values after Find/Add calls on the maps. TMap indicates that the pointer to the value associated with a key "is only valid until the next change to any key in the map." The lifetime of the PSO pointers is managed by the low level caches (graphics and compute). Added stat for the number of Pipeline State Objects.
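The pattern can be sketched with standard containers: the cache stores owning pointers, so the pointed-to object keeps its address even when the container relocates its entries. The types below are hypothetical stand-ins, and the use of std::unique_ptr for ownership is an assumption; the changelist only says the low level caches manage the lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipeline state object.
struct FPipelineStateSketch
{
    uint64_t Hash = 0;
};

// Flat cache whose storage relocates on growth, similar in spirit to a map
// that only guarantees value addresses until its next modification.
class FPipelineStateCacheSketch
{
public:
    FPipelineStateSketch* FindOrAdd(uint64_t Hash)
    {
        for (auto& Entry : Entries)
        {
            if (Entry.first == Hash)
            {
                return Entry.second.get();
            }
        }
        Entries.emplace_back(Hash, std::make_unique<FPipelineStateSketch>());
        Entries.back().second->Hash = Hash;
        // The entry itself may move later, but the heap object it owns does not.
        return Entries.back().second.get();
    }

    std::size_t Num() const { return Entries.size(); }

private:
    std::vector<std::pair<uint64_t, std::unique_ptr<FPipelineStateSketch>>> Entries;
};
```

A pointer returned by FindOrAdd stays valid across later insertions, which is the property the cache needs when callers hold on to a PSO after further Find/Add calls.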
*** CL 3183931 ***
Update Windows D3D12 headers and libs to RS1 release bits (10.0.14393.0)
*** CL 3183978 ***
Update UBT Windows build settings:
- Change D3D12 PIX profiling enable/disable to match Xbox and handle logic in the UEBuildWindows.cs for UBT. Also add a static assert to inform the developer when PIX profiling is requested but the engine is compiling out draw events.
-Delay load D3D12.dll and add root signature 1.1 support.
*** CL 3184132 ***
Fix Xbox PSO cache code where it could leak PSOs. Related to change 3183837.
*** Changelist 3211714 ***
Update D3D12 RHI with fixes:
- Check if we can reserve slots in GatherUniqueSamplerTables
- DirtyState more often in StateCache
- Remove InternalSetSamplerState. The alternate function isn't used.
- Allow MRTClear for arrays with holes in them
- Fix uninitialized descriptors. This was causing a GPU hang on Xbox. We need to set dirty bits for resources bound to slots outside of the current descriptor table's range
- Cleanup SetDescriptorHeap code. Move setting descriptor heap logic to the descriptor cache since it also owns things like the sampler maps. Added members to the descriptor cache to track the last heaps that were set on the command list to avoid dirtying bits unnecessarily.
- Resource transitions: go through Common between queues (3D <--> Compute)
- Fix initial state for placed resources.
- Merging epic
Change 3213250 on 2016/11/29 by Chris.Bunner
GBufferHints tooltip fix.
#jira UE-39103
Change 3213345 on 2016/11/29 by Gil.Gribb
more IWYU fallout
Change 3213676 on 2016/11/29 by Rolando.Caloca
DR - Fix incorrect texture getting cleared
Change 3213728 on 2016/11/29 by Rolando.Caloca
DR - Lambda-ize
Change 3214461 on 2016/11/29 by Ben.Woodhouse
Rollout August QFE4 XDK (required for latest DX12 changes on XB1)
Change 3215317 on 2016/11/30 by Daniel.Wright
PS4 compile fix
Change 3216343 on 2016/11/30 by Arne.Schober
DR - UE-39155 - after talking to Brian it occurred to us that flipping the world space normal is nonsensical. And indeed the Grass was using world space normals.
Change 3216844 on 2016/12/01 by Ben.Woodhouse
Fix for static analysis warnings after discussion with Microsoft
Change 3216916 on 2016/12/01 by Gil.Gribb
Merging //UE4/Dev-Main@3216539 to Dev-Rendering (//UE4/Dev-Rendering)
Change 3217385 on 2016/12/01 by Arne.Schober
DR - UE-39218, UE-39221, UE-39224 and potentially UE-39214 - The Stencil bits for Light channels and decal application were not set in the dynamic basepass
Change 3217464 on 2016/12/01 by Ben.Woodhouse
Fix for reflection capture resize assert. The assert is only valid in cooked builds, so disable it in editor
#jira UE-39225
Change 3217534 on 2016/12/01 by Arne.Schober
DR - Fix Merge conflict
Change 3217581 on 2016/12/01 by Rolando.Caloca
DR - Fix assert on debug
Change 3217741 on 2016/12/01 by Benjamin.Hyder
Duplicate audio fix.
Change 3217890 on 2016/12/01 by Rolando.Caloca
DR - Fix widget not rendering properly when hidden
#jira UE-39221
Change 3218129 on 2016/12/01 by Arne.Schober
DR - UE-39214 - LOD dither value was accidentally cached across the static draw list.
Change 3218759 on 2016/12/02 by Guillaume.Abadie
Fixes editor compositing bug caused by alpha through post processing change 3209903
#jira UE-39221
[CL 3219854 by Marcus Wassmer in Main branch]
2016-12-02 16:43:04 -05:00
}
2020-09-08 17:44:06 -04:00
ClipmapViewState.LastPartialUpdateOriginInPages = PageGridCenter;
2015-05-11 20:04:15 -04:00
}
2020-09-08 17:44:06 -04:00
const FVector SnappedCenter = FVector(ClipmapViewState.LastPartialUpdateOriginInPages) * ClipmapPageSize;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
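A minimal sketch of the render target count calculation from change 3251058, assuming the intended result is the highest bound slot index plus one, so that a set of all-unknown formats yields zero. The enum and function are hypothetical stand-ins for DXGI_FORMAT and the RHI code; the changelist does not spell out the exact rule.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for DXGI_FORMAT; only Unknown matters here.
enum class EFormatSketch : uint32_t
{
    Unknown = 0,
    R8G8B8A8 = 1,
    R16G16B16A16Float = 2
};

constexpr uint32_t MaxRenderTargetsSketch = 8;

// Returns the number of slots a PSO description should declare: one past the
// last slot with a known format, or 0 when every format is Unknown.
inline uint32_t ComputeNumValidRenderTargets(
    const std::array<EFormatSketch, MaxRenderTargetsSketch>& Formats,
    uint32_t NumRenderTargets)
{
    assert(NumRenderTargets <= MaxRenderTargetsSketch);
    uint32_t NumValid = 0;
    for (uint32_t Index = 0; Index < NumRenderTargets; ++Index)
    {
        if (Formats[Index] != EFormatSketch::Unknown)
        {
            NumValid = Index + 1;
        }
    }
    return NumValid;
}
```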
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
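The waste in change 3251141 comes from how buddy allocators generally work: each request is rounded up to a power-of-two block. The sketch below shows that generic rounding only; the function names and the minimum block size parameter are hypothetical, and it does not model the engine's allocator or reproduce the measured numbers above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Smallest power of two that is greater than or equal to Size.
inline uint64_t RoundUpToPowerOfTwo(uint64_t Size)
{
    assert(Size > 0 && Size <= (1ull << 62));
    uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost when a generic buddy allocator serves a request of Size bytes.
inline uint64_t ComputeBuddyWaste(uint64_t Size, uint64_t MinBlockSize)
{
    const uint64_t Block = std::max(RoundUpToPowerOfTwo(Size), MinBlockSize);
    return Block - Size;
}
```

A request of 5000 bytes, for example, occupies an 8192-byte block and loses 3192 bytes, whereas a request that is already a power of two loses nothing.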
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), and finally converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
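The fence reuse bug in the list above can be illustrated with a small model of a monotonic fence, where a wait on a value is satisfied as soon as the completed value is greater than or equal to it. This is only a sketch under assumptions: `FSimpleFence` and `FFenceCounter` are hypothetical names, a 64-bit counter is assumed, and none of this is the D3D12RHI code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GPU fence: stores the last value that was reached.
struct FSimpleFence
{
	uint64_t CompletedValue = 0;

	void Signal(uint64_t Value)
	{
		// A fence is never moved back to a previous value.
		assert(Value >= CompletedValue);
		CompletedValue = Value;
	}

	// A wait on Value returns immediately once the fence has reached it.
	bool IsComplete(uint64_t Value) const { return CompletedValue >= Value; }
};

// Hands out strictly increasing values starting at 1, so no wait ever targets 0.
struct FFenceCounter
{
	uint64_t LastSignaled = 0;
	uint64_t Next() { return ++LastSignaled; }
};
```

With this model, `IsComplete(0)` is true for any fence state, which is why waiting for 0 after signaling the maximum value never actually blocks. Handing out increasing values and never resetting avoids the problem.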
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
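The allocator described in the ESRAM entry above hands out pages that need not be adjacent and returns them to the pool when a resource is destroyed. A minimal sketch of that free-list idea follows. `FPagePool` is a hypothetical name, and the mapping onto a contiguous virtual address range is modeled only as an array indexed by virtual page; no platform API is used or implied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pool of fixed-size physical pages tracked by a free list.
class FPagePool
{
public:
	explicit FPagePool(uint32_t InTotalPages) : TotalPages(InTotalPages)
	{
		// Push in reverse so that page 0 is handed out first.
		for (uint32_t Page = InTotalPages; Page > 0; --Page)
		{
			FreePages.push_back(Page - 1);
		}
	}

	// Returns one physical page per virtual page of the allocation, in virtual order.
	// The physical pages may be non-contiguous. Returns an empty mapping on failure.
	std::vector<uint32_t> Allocate(uint32_t NumPages)
	{
		std::vector<uint32_t> Mapping;
		if (NumPages > FreePages.size())
		{
			return Mapping;
		}
		Mapping.reserve(NumPages);
		for (uint32_t Index = 0; Index < NumPages; ++Index)
		{
			Mapping.push_back(FreePages.back());
			FreePages.pop_back();
		}
		return Mapping;
	}

	// Returns the pages of a destroyed resource to the pool for reuse.
	void Free(const std::vector<uint32_t>& Mapping)
	{
		assert(FreePages.size() + Mapping.size() <= TotalPages);
		FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
	}

	std::size_t NumFree() const { return FreePages.size(); }

private:
	uint32_t TotalPages;
	std::vector<uint32_t> FreePages;
};
```

After a free, a later allocation can be satisfied from scattered pages, which is the property a purely contiguous layout scheme lacks.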
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
const FGlobalDFCacheType StartCacheType = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? GDF_MostlyStatic : GDF_Full;
2015-05-11 20:04:15 -04:00
2017-01-26 19:20:49 -05:00
for (uint32 CacheType = StartCacheType; CacheType < GDF_Num; CacheType++)
{
	FGlobalDistanceFieldClipmap& Clipmap = *(CacheType == GDF_MostlyStatic
		? &GlobalDistanceFieldInfo.MostlyStaticClipmaps[ClipmapIndex]
		: &GlobalDistanceFieldInfo.Clipmaps[ClipmapIndex]);
	// Setup clipmap properties from view state exclusively, so we can skip updating on some frames
2020-09-08 17:44:06 -04:00
	Clipmap.Bounds = FBox(SnappedCenter - ClipmapExtent, SnappedCenter + ClipmapExtent);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
// Scroll offset so the contents of the global distance field don't have to be moved as the camera moves around, only updated in slabs
2020-09-08 17:44:06 -04:00
Clipmap . ScrollOffset = FVector ( ClipmapViewState . LastPartialUpdateOriginInPages - ClipmapViewState . FullUpdateOriginInPages ) * ClipmapPageSize ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
}
2016-04-04 18:44:59 -04:00
2022-02-02 07:59:31 -05:00
ClipmapViewState . CachedClipmapCenter = ( FVector3f ) SnappedCenter ;
2020-09-08 17:44:06 -04:00
ClipmapViewState . CachedClipmapExtent = ClipmapExtent ;
2021-06-16 17:48:21 -04:00
ClipmapViewState . CacheClipmapInfluenceRadius = ClipmapInfluenceRadius ;
2017-01-26 19:20:49 -05:00
ClipmapViewState . CacheMostlyStaticSeparately = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ;
2015-05-11 20:04:15 -04:00
}
2020-07-06 18:58:26 -04:00
ensureMsgf ( GAOGlobalDistanceFieldStaggeredUpdates || NumClipmapUpdateRequests <= GetNumClipmapUpdatesPerFrame ( ) , TEXT ( "ShouldUpdateClipmapThisFrame needs to be adjusted for the NumClipmaps to even out the work distribution" ) ) ;
2015-05-11 20:04:15 -04:00
}
else
{
for ( int32 ClipmapIndex = 0 ; ClipmapIndex < NumClipmaps ; ClipmapIndex ++ )
{
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
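For illustration only: one common source of waste in buddy allocators is that each request is rounded up to a power-of-two block. The sketch below shows that arithmetic under that assumption. It is not the D3D12RHI code, and the names RoundUpToPowerOfTwo and BuddyWaste are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two that is >= Size.
inline uint64_t RoundUpToPowerOfTwo(uint64_t Size)
{
    // Guard against zero and against overflow of the shift below.
    assert(Size > 0 && Size <= (uint64_t(1) << 63));
    uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Hypothetical helper: bytes left unused when a buddy allocator with the given
// minimum block size serves a request of Size bytes (assumes power-of-two blocks).
inline uint64_t BuddyWaste(uint64_t Size, uint64_t MinBlockSize)
{
    uint64_t Block = RoundUpToPowerOfTwo(Size);
    if (Block < MinBlockSize)
    {
        Block = MinBlockSize;
    }
    return Block - Size;
}
```

Under this model a 5KB request occupies an 8KB block and leaves 3KB unused, while an exact 4KB request leaves nothing unused.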
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ANSI strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ANSI and then converted to FStrings (slowly), before finally being converted back to ANSI strings before use. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was that HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
const FGlobalDFCacheType StartCacheType = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? GDF_MostlyStatic : GDF_Full ;
2015-05-11 20:04:15 -04:00
2017-01-26 19:20:49 -05:00
for ( uint32 CacheType = StartCacheType ; CacheType < GDF_Num ; CacheType + + )
{
FGlobalDistanceFieldClipmap & Clipmap = * ( CacheType = = GDF_MostlyStatic
? & GlobalDistanceFieldInfo . MostlyStaticClipmaps [ ClipmapIndex ]
: & GlobalDistanceFieldInfo . Clipmaps [ ClipmapIndex ] ) ;
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
Clipmap . ScrollOffset = FVector ( 0 ) ;
2015-05-11 20:04:15 -04:00
2021-02-04 15:30:42 -04:00
const int32 ClipmapResolution = GlobalDistanceField : : GetClipmapResolution ( bLumenEnabled ) ;
const float Extent = GlobalDistanceField : : GetClipmapExtent ( ClipmapIndex , Scene , bLumenEnabled ) ;
2020-09-08 17:44:06 -04:00
const float ClipmapVoxelSize = ( 2.0f * Extent ) / ClipmapResolution ;
const float ClipmapPageSize = GGlobalDistanceFieldPageResolution * ClipmapVoxelSize ;
2021-02-04 15:30:42 -04:00
const FVector GlobalDistanceFieldViewOrigin = GetGlobalDistanceFieldViewOrigin ( View , ClipmapIndex , bLumenEnabled ) ;
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
FIntVector PageGridCenter ;
PageGridCenter . X = FMath : : RoundToInt ( GlobalDistanceFieldViewOrigin . X / ClipmapPageSize ) ;
PageGridCenter . Y = FMath : : RoundToInt ( GlobalDistanceFieldViewOrigin . Y / ClipmapPageSize ) ;
PageGridCenter . Z = FMath : : RoundToInt ( GlobalDistanceFieldViewOrigin . Z / ClipmapPageSize ) ;
2015-05-11 20:04:15 -04:00
2020-09-08 17:44:06 -04:00
FVector Center = FVector ( PageGridCenter ) * ClipmapPageSize ;
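The lines above derive a page size from the clipmap extent and snap the view origin to a whole number of pages. A minimal standalone sketch of that arithmetic follows. Vec3, IntVec3 and the three functions are hypothetical stand-ins for the engine types, std::floor(x + 0.5f) is assumed to approximate FMath::RoundToInt, and the page resolution of 8 in the usage note is an assumed value, not taken from the source.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for FVector / FIntVector.
struct Vec3 { float X, Y, Z; };
struct IntVec3 { int X, Y, Z; };

// World-space size of one page: voxels per page times voxel size,
// where a clipmap spans [-Extent, +Extent] with ClipmapResolution voxels.
inline float ComputeClipmapPageSize(float Extent, int ClipmapResolution, int PageResolution)
{
    assert(ClipmapResolution > 0 && PageResolution > 0);
    const float VoxelSize = (2.0f * Extent) / static_cast<float>(ClipmapResolution);
    return static_cast<float>(PageResolution) * VoxelSize;
}

// Round-to-nearest as floor(x + 0.5), assumed here to mirror FMath::RoundToInt.
inline int RoundToNearest(float Value)
{
    return static_cast<int>(std::floor(Value + 0.5f));
}

// Integer page coordinates of the page nearest to the view origin.
inline IntVec3 ComputePageGridCenter(const Vec3& ViewOrigin, float PageSize)
{
    assert(PageSize > 0.0f);
    return { RoundToNearest(ViewOrigin.X / PageSize),
             RoundToNearest(ViewOrigin.Y / PageSize),
             RoundToNearest(ViewOrigin.Z / PageSize) };
}

// World-space clipmap center, always a whole multiple of PageSize.
inline Vec3 ComputeSnappedCenter(const IntVec3& PageGridCenter, float PageSize)
{
    return { PageGridCenter.X * PageSize,
             PageGridCenter.Y * PageSize,
             PageGridCenter.Z * PageSize };
}
```

With an extent of 1024, a clipmap resolution of 128 and an assumed page resolution of 8, the page size is 128 units, so a view origin of (200, -70, 64) snaps to a center of (256, -128, 128). Because the center only moves in whole-page steps, small camera movements leave it unchanged.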
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
FBox ClipmapBounds ( Center - Extent , Center + Extent ) ;
Clipmap . Bounds = ClipmapBounds ;
2020-07-06 18:58:26 -04:00
Clipmap . UpdateBounds . Reset ( ) ;
Clipmap . UpdateBounds . Add ( FClipmapUpdateBounds ( ClipmapBounds . GetCenter ( ) , ClipmapBounds . GetExtent ( ) , false ) ) ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
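The render target count fix described above can be sketched as follows. This is a minimal illustration, not the engine's code: EPixelFormatSketch, MaxRenderTargetsSketch and ComputeNumValidRenderTargets are hypothetical names, and the enum stands in for DXGI_FORMAT, where only "unknown or not" matters. The assumption is that the valid count is the index of the last slot with a known format, plus one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for DXGI_FORMAT; Unknown plays the role of DXGI_UNKNOWN.
enum class EPixelFormatSketch : std::uint8_t { Unknown = 0, RGBA8, RGBA16F };

constexpr std::size_t MaxRenderTargetsSketch = 8;

// Returns the index of the last slot with a known format, plus one.
// Returns 0 when every requested slot is Unknown, even if RequestedCount > 0.
inline std::uint32_t ComputeNumValidRenderTargets(
    const std::array<EPixelFormatSketch, MaxRenderTargetsSketch>& Formats,
    std::uint32_t RequestedCount)
{
    assert(RequestedCount <= MaxRenderTargetsSketch);
    std::uint32_t NumValid = 0;
    for (std::uint32_t Index = 0; Index < RequestedCount; ++Index)
    {
        if (Formats[Index] != EPixelFormatSketch::Unknown)
        {
            NumValid = Index + 1;
        }
    }
    return NumValid;
}
```

With this shape, a request for four targets whose formats are all unknown yields a count of zero instead of four.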
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
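To make the wastage argument concrete, here is a small sketch of the rounding cost in a generic buddy allocator, assuming the textbook scheme where each request is rounded up to the next power-of-two block. BuddyBlockSize and BuddyWaste are hypothetical helpers; the changelist does not show how the D3D12 RHI allocator actually sizes its blocks.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Size up to the next power-of-two block, never below MinBlockSize.
// MinBlockSize must itself be a power of two.
inline std::uint64_t BuddyBlockSize(std::uint64_t Size, std::uint64_t MinBlockSize)
{
    assert(Size > 0);
    assert(MinBlockSize > 0 && (MinBlockSize & (MinBlockSize - 1)) == 0);
    std::uint64_t Block = MinBlockSize;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to rounding for a single allocation.
inline std::uint64_t BuddyWaste(std::uint64_t Size, std::uint64_t MinBlockSize)
{
    return BuddyBlockSize(Size, MinBlockSize) - Size;
}
```

Under this model a 5KB request occupies an 8KB block and wastes 3KB, while a 65KB request occupies a 128KB block and wastes 63KB, which is why large requests are better served outside the buddy allocator.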
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before use. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
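The fence reuse bug in the list above comes down to how fence completion is tested. The following is a simulated sketch, assuming the usual rule that a fence value counts as complete once the completed value is greater than or equal to it. FFenceSketch and its methods are hypothetical and do not mirror the D3D12RHI classes.

```cpp
#include <cassert>
#include <cstdint>

// Simulated monotonically increasing fence.
class FFenceSketch
{
public:
    // CPU side: reserve the next value that the GPU will write on reaching this point.
    std::uint64_t Signal() { return ++LastSignaled; }

    // GPU side (simulated): the GPU has reached the signal carrying Value.
    void OnGpuReached(std::uint64_t Value)
    {
        assert(Value <= LastSignaled);
        if (Value > Completed)
        {
            Completed = Value;
        }
    }

    // A value is complete once the fence has advanced to it or past it.
    bool IsComplete(std::uint64_t Value) const { return Completed >= Value; }

private:
    std::uint64_t LastSignaled = 0;
    std::uint64_t Completed = 0;
};
```

Because completion is a greater-or-equal test, waiting for 0 returns immediately on any fence, so a wait that should have blocked does not. Handing out strictly increasing values, as Signal() does here, avoids any need to reset the fence to an earlier value.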
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
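The page allocator idea in the ESRAM entry above can be sketched with a simple free list. This is an illustration under stated assumptions, not the engine's allocator: FPageAllocatorSketch is a hypothetical class, pages are plain indices, and the step that maps the returned pages onto a contiguous virtual address range is platform specific and not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Free-list allocator over a fixed pool of physical pages.
class FPageAllocatorSketch
{
public:
    explicit FPageAllocatorSketch(std::uint32_t NumPages)
    {
        // Push in reverse so the lowest page indices are handed out first.
        for (std::uint32_t Index = NumPages; Index > 0; --Index)
        {
            FreePages.push_back(Index - 1);
        }
    }

    // Fills OutPages with Count physical page indices, which need not be adjacent.
    // Returns false and leaves all state untouched if too few pages are free.
    bool Allocate(std::size_t Count, std::vector<std::uint32_t>& OutPages)
    {
        if (Count > FreePages.size())
        {
            return false;
        }
        OutPages.clear();
        for (std::size_t Index = 0; Index < Count; ++Index)
        {
            OutPages.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return true;
    }

    // Returns pages to the pool so later allocations can reuse them.
    void Free(const std::vector<std::uint32_t>& Pages)
    {
        FreePages.insert(FreePages.end(), Pages.begin(), Pages.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<std::uint32_t> FreePages;
};
```

After a few allocations and frees the pool fragments, yet a later request can still be satisfied from scattered pages, which is the property a contiguous-only layout lacks.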
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
}
2015-05-11 20:04:15 -04:00
}
}
2015-05-18 13:21:23 -04:00
2022-01-26 17:07:27 -05:00
GlobalDistanceFieldInfo . UpdateParameterData ( MaxOcclusionDistance , bLumenEnabled , View . FinalPostProcessSettings . LumenSceneViewDistance ) ;
2015-05-11 20:04:15 -04:00
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
void FViewInfo::SetupDefaultGlobalDistanceFieldUniformBufferParameters(FViewUniformShaderParameters& ViewUniformShaderParameters) const
{
// Initialize global distance field members to defaults, because View.GlobalDistanceFieldInfo is not valid yet
2022-04-22 19:55:41 -04:00
for (int32 Index = 0; Index < GlobalDistanceField::MaxClipmaps; Index++)
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
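The three r.sss.checkerboard values described above can be sketched as a small selection function. This is only an illustration, not the engine's implementation: the enum, the function name and the bSceneColorSupportsFullRes flag are hypothetical, and the automatic mode is assumed to fall back to checkerboard whenever the scene color target is not 64-bit or wider with separate alpha.

```cpp
#include <cstdint>

// Hypothetical names; only the cvar values 0/1/2 come from the change description.
enum class ESkinLightingMode { FullRes, Checkerboard };

// CVarValue mirrors r.sss.checkerboard: 0 = off/full res, 1 = checkerboard, 2 = automatic.
// bSceneColorSupportsFullRes is an assumed flag meaning
// "64-bit or wider render target with separate alpha".
inline ESkinLightingMode ResolveSkinLightingMode(int32_t CVarValue, bool bSceneColorSupportsFullRes)
{
    switch (CVarValue)
    {
    case 0:
        return ESkinLightingMode::FullRes;
    case 1:
        return ESkinLightingMode::Checkerboard;
    default:
        // Automatic: use full resolution only when the render target allows it.
        return bSceneColorSupportsFullRes ? ESkinLightingMode::FullRes
                                          : ESkinLightingMode::Checkerboard;
    }
}
```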
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Color using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Runs in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
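A minimal sketch of why a memcpy-created snapshot needs its cached pointer cleared by hand. The types and names below are hypothetical stand-ins rather than the engine's FViewInfo, and the stated reason (a bitwise copy would otherwise leave the snapshot pointing at the source view's cache) is an assumption drawn from the description above.

```cpp
#include <cstring>

// Hypothetical stand-in for the cached uniform shader parameters.
struct FCachedParamsSketch
{
    float Values[4];
};

// Hypothetical, trivially copyable stand-in for a view.
// CachedParams plays the role of the cached-parameters pointer.
struct FViewSketch
{
    int ViewId;
    FCachedParamsSketch* CachedParams;
};

// Creates a snapshot with a bitwise copy, then drops the copied pointer so the
// snapshot never aliases (or later releases) the source view's cache.
inline FViewSketch MakeSnapshot(const FViewSketch& Source)
{
    FViewSketch Snapshot;
    std::memcpy(&Snapshot, &Source, sizeof(FViewSketch));
    Snapshot.CachedParams = nullptr; // manual clear: memcpy copied the pointer verbatim
    return Snapshot;
}
```

The snapshot can then rebuild its own cached parameters on demand without touching the source view.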
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
{
2021-09-22 10:01:48 -04:00
ViewUniformShaderParameters.GlobalVolumeCenterAndExtent[Index] = FVector4f(0);
ViewUniformShaderParameters.GlobalVolumeWorldToUVAddAndMul[Index] = FVector4f(0);
ViewUniformShaderParameters.GlobalDistanceFieldMipWorldToUVScale[Index] = FVector4f(0);
ViewUniformShaderParameters.GlobalDistanceFieldMipWorldToUVBias[Index] = FVector4f(0);
2016-08-17 11:38:13 -04:00
}
2020-09-15 11:03:59 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldMipFactor = 1.0f;
ViewUniformShaderParameters.GlobalDistanceFieldMipTransition = 0.0f;
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldClipmapSizeInPages = 1;
2022-02-02 07:59:31 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldInvPageAtlasSize = FVector3f::OneVector;
2022-03-01 21:07:45 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldInvCoverageAtlasSize = FVector3f::OneVector;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 4041614)
#lockdown Nick.Penwarden
============================
MAJOR FEATURES & CHANGES
============================
Change 3774677 by Arne.Schober
DR - Deprecated SetLocal from the RHICmdlist
Fixed some unnecessary PSO collisions.
Change 3809579 by Chris.Bunner
Back out changelist 3774677.
#jira UE-53483
Change 3810363 by Mark.Satterthwaite
More random fixes to mtlpp: most important is the extension to Buffer that allows creation of sub-buffers that are merely views onto a sub-range of the parent. These sub-buffers are valid to use throughout the mtlpp API with two exceptions: they may not be used for visibilityResultsBuffers and Set*BufferOffset functions cannot take this offset into account (as the encoder does not hold onto the buffers and I don't want it to). In the case of Set*BufferOffset the caller has to know what is going on and in the case of visibilityResultsBuffers it'll just assert as it isn't sensible.
This makes it *much* easier to do things like sub-buffer allocation, though the caller must be aware of the alignment restrictions of their intended usage as they are not possible to enforce. For example, a call to SetVertexBuffer requires an offset alignment that matches the alignment of the data-type in the shader for "device" resources, while for "constant" data it must be max(4, sizeof(datatype)) on iOS and 256 on macOS. This should allow for much more tightly packed sub-allocations than earlier approaches, though older drivers (e.g. Mac OS X 10.11) enforce only the coarser "constant" data restriction everywhere.
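The "constant" data alignment rule quoted above can be sketched as a pair of helpers that a sub-allocator might use to place offsets. This is a hedged illustration only: EPlatformSketch, ConstantOffsetAlignment and AlignUp are hypothetical names and are not part of mtlpp or Metal; only the max(4, sizeof(datatype)) and 256 values come from the change description.

```cpp
#include <cstddef>

// Hypothetical platform tag; not an mtlpp or Metal type.
enum class EPlatformSketch { iOS, macOS };

// Minimum offset alignment for "constant" address-space data as described in
// the change: max(4, sizeof(datatype)) on iOS and 256 on macOS.
constexpr std::size_t ConstantOffsetAlignment(EPlatformSketch Platform, std::size_t DataTypeSize)
{
    return Platform == EPlatformSketch::macOS
        ? 256
        : (DataTypeSize > 4 ? DataTypeSize : 4);
}

// Rounds Offset up to the next multiple of Alignment (Alignment must be non-zero).
constexpr std::size_t AlignUp(std::size_t Offset, std::size_t Alignment)
{
    return (Offset + Alignment - 1) / Alignment * Alignment;
}
```

A caller carving sub-buffers out of a parent buffer would pass each candidate offset through AlignUp before binding it, since the change notes that these restrictions cannot be enforced by the wrapper itself.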
Change 3810407 by Marcus.Wassmer
PR #4322: ShadowSetup Bug Fix: Only stencil mask drawn meshes (Contributed by DSDambuster)
Change 3810676 by Guillaume.Abadie
Makes r.Test.SecondaryUpscaleOverride work with any arbitrary pixel size.
Change 3810696 by Guillaume.Abadie
Adds support for #include "../MyFile.ush" in the shader compiler.
Change 3810698 by Guillaume.Abadie
Implements enum class based shader permutation dimension.
Change 3810699 by Guillaume.Abadie
Implements Diaphragm DOF ground work.
Change 3811536 by Guillaume.Abadie
Pulls the trigger on CircleDOF's setup pass for DiaphragmDOF.
Change 3811958 by Mark.Satterthwaite
More fixes for mtlpp.
Change 3811964 by Mark.Satterthwaite
Only views onto a mtlpp::Buffer should return a valid parent-buffer.
Change 3812604 by Guillaume.Abadie
Changes Diaphragm DOF's source file layout.
Change 3812827 by Mark.Satterthwaite
More missing/broken functionality in mtlpp fixed and fixed obvious leaks.
Change 3812920 by Guillaume.Abadie
Adds support for per mip level UAV in FSceneRenderTarget.
Change 3812926 by Mark.Satterthwaite
Change the way we handle mtlpp resource construction to avoid leaks.
Change 3812960 by Rolando.Caloca
DR - vk - Disable DFGI
Change 3812968 by Rolando.Caloca
DR - Linker fix
Change 3813318 by Mark.Satterthwaite
Fix linear texture allocation from a buffer sub-view.
Change 3813326 by Mark.Satterthwaite
Fix another Metal mtlpp sub-buffer allocation failure.
Change 3813328 by Guillaume.Abadie
Removes global samplers in TAA for GL4, Vulkan and Switch.
Change 3813937 by Rolando.Caloca
DR - Fix logs not getting dumped when r.DumpSCWQueuedJobs is on
Change 3813947 by Rolando.Caloca
DR - noshaderworker should override r.XGEShaderCompile
Change 3817017 by Uriel.Doyon
Fixed texture editor black screen
#jira UE-53653
Change 3818568 by Rolando.Caloca
DR - Fix log when shader jobs crash
- Move log10 to common
- Added COMPILER_VULKAN define
Change 3818603 by Uriel.Doyon
Fix to static analysis warning
Change 3818623 by Rolando.Caloca
DR - Workaround hlslcc loop unrolling bug
Change 3819070 by Uriel.Doyon
Fix to stat duplication.
Change 3819105 by Uriel.Doyon
Refactored volume sample shader to avoid using texture dimension.
Change 3819136 by Rolando.Caloca
DR - vk - Per platform files (empty)
Change 3819180 by Rolando.Caloca
DR - vk - Move defines out of config into per platform
Change 3819247 by Rolando.Caloca
DR - vk - Remove more defines into platform settings
Change 3819318 by Rolando.Caloca
DR - vk - Fixes for linking
Change 3819868 by Rolando.Caloca
DR - vk - Linux & Android fixes
Change 3819873 by Guillaume.Abadie
Adds support for PermutationId on r.DumpShaderDebugInfo=1
Change 3819940 by Rolando.Caloca
DR - vk - Fix Linux issues
Change 3819956 by Rolando.Caloca
DR - vk - Invalid check
Change 3819961 by Michael.Lentine
Hide attributes when plugin is not present
Change 3819980 by Rolando.Caloca
DR - vk - Standard validation always
Change 3820039 by Rolando.Caloca
DR - vk - Fix invalid ensure
Change 3820326 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3820422 by Michael.Lentine
Add back GBufferAO.
Change 3820433 by Rolando.Caloca
DR - Fix D3D12 crash on 20 thread (10x2 cores) machines
Change 3821677 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3821961 by Rolando.Caloca
DR - Vulkan uses real UB by default on non-Android
Change 3821968 by Rolando.Caloca
DR - vk - Update glslang 1.0.65.1
Change 3821969 by Uriel.Doyon
Added support for stat groups that must be sorted by name. Defined by DECLARE_STATS_GROUP_SORTBYNAME.
Change 3821983 by Rolando.Caloca
DR - vk - Change to static array (0.1ms on 10k draw calls)
Change 3824141 by Rolando.Caloca
DR - vk - Fix static analysis
- Bumped up some (c) 2017->2018
Change 3824355 by Rolando.Caloca
DR - vk - Accessor to find out if a cmd buffer has been submitted
Change 3824420 by Rolando.Caloca
DR - Sanity check number of queries per batch on D3D11 as to not break other RHIs
Change 3824463 by Rolando.Caloca
DR - Removed dummy ensure for D3D12
Change 3824609 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3826074 by Mark.Satterthwaite
Start IMP-caching the various descriptor types in mtlpp.
Change 3826098 by Rolando.Caloca
DR - vk - Dump layer compile fixes
Change 3826113 by Rolando.Caloca
DR - vk - Missing dump functions
Change 3826302 by Rolando.Caloca
DR - vk - Compile fix
- Change dump handles to %p
Change 3826635 by Mark.Satterthwaite
Forward declarations required for mtlpp compilation without exposing Metal headers - plus fixes to the mtlpp test compiler.
Change 3827072 by Mark.Satterthwaite
Switch some more mtlpp descriptors over to IMPTables from objc_msgSend.
Change 3827909 by Guillaume.Abadie
Replaces diaphragm DOF's prefiltering with LDS bank coherent bilateral reduction, and implements 1/8 res background gathering pass.
Change 3827952 by Guillaume.Abadie
Updates copyright to year 2018 on diaphragm DOF's new files.
Change 3828055 by Rolando.Caloca
DR - vk - Rename in prep for changes
Change 3828229 by Guillaume.Abadie
Avoids logging the same global shader type name multiple times when it has multiple permutations while verifying the global shader map.
Change 3828427 by Guillaume.Abadie
Reimplements Max3x3 gathering post filtering for Diaphragm DOF with proper shader permutation.
Change 3829979 by Guillaume.Abadie
Fixes a color NaN source in diaphragm DOF's TAA pass.
Change 3830116 by Rolando.Caloca
DR - vk - Fix GPU queries/frame time on old system
- New system in place, disabled temporarily
Change 3830169 by Rolando.Caloca
DR - vk - Fix async pso creation crash
Change 3830193 by Rolando.Caloca
DR - vk - CPU RHI thread improvement
Change 3830291 by Guillaume.Abadie
Automatically lowers the number of gathering rings on the background half res gather pass as the far CoC gets smaller.
Change 3830300 by Rolando.Caloca
DR - vk - Static analysis fix: Split VulkanCommon.h out of VulkanConfiguration.h
Change 3830589 by Mark.Satterthwaite
In mtlpp cache the IMPTables for all the Metal @protocol's that are dependent on the MTLDevice, this avoids a mutex & map lookup. Also make all the concrete types store their IMPTable statically as it won't change.
Change 3830793 by Mark.Satterthwaite
Fix a small number of bugs introduced with the mtlpp descriptor and table caching.
Change 3831491 by Jian.Ru
Fix driver version unknown
#jira UE-53688
Change 3832335 by Rolando.Caloca
DR - vk - Change include
Change 3832550 by Rolando.Caloca
DR - vk - Occlusion query rewrite WIP
Change 3832589 by Rolando.Caloca
DR - vk - Minor refactor to pools in prep for timestamps
Change 3832618 by Rolando.Caloca
DR - vk - Do not block timestamp queries
Change 3832636 by Rolando.Caloca
DR - vk - Fix old timestamp queries
Change 3833138 by Rolando.Caloca
DR - vk - Fix timestamp queries
Change 3833249 by Rolando.Caloca
DR - vk - Test lock
Change 3833667 by Rolando.Caloca
DR - vk - Old queries wait on the RHI thread now instead of the driver (disabled)
Change 3833907 by Daniel.Wright
Fixed NextStartOffset UAV index out of bounds
Change 3833918 by Daniel.Wright
D3D12 RHI: only refcount uniform buffers if GRHINeedsExtraDeletionLatency is false, which is no longer the case for PC or Xbox. The refcounting was heavy on performance as reported by a licensee because FRHIResource uses atomics for refcounting, which is only necessary when GRHINeedsExtraDeletionLatency is disabled.
Change 3834852 by Rolando.Caloca
DR - vk - Missing file
Change 3834858 by Guillaume.Abadie
Implements r.DOF.MinimalFullresBlurringRadius
Change 3834979 by Rolando.Caloca
DR - vk - Fix
Change 3836117 by Rolando.Caloca
DR - vk - Update to 1.0.65.1
Change 3836122 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOcclusionBatchCmdBuffer
- Added new error codes/messages
Change 3836421 by Mark.Satterthwaite
For the purposes of debugging and conformance testing mtlpp make it possible to compile *without* the IMP cache so that we call the underlying Objective-C.
Change 3836896 by Uriel.Doyon
Fixed concurrency and exit issues around d3d12 pipeline states on windows.
Change 3837385 by Rolando.Caloca
DR - vk - Dump memory on OOM
Change 3837427 by Rolando.Caloca
DR - vk - Change some arrays to array views
Change 3837800 by Guillaume.Abadie
Implements SHADER_PERMUTATION_RANGE_INT to make contiguous integer permutations that do not start at 0.
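A minimal sketch of the idea behind a contiguous integer permutation range that does not start at 0. TIntRangeDimension, its member names and FRingCountDim are hypothetical stand-ins for illustration, not the engine's actual macro expansion: values in [First, First + Count) are simply shifted down to zero-based permutation ids.

```cpp
#include <cassert>

// Hypothetical stand-in for an integer permutation dimension covering
// the contiguous values [First, First + Count).
template <int First, int Count>
struct TIntRangeDimension
{
    static constexpr int PermutationCount = Count;

    // True if Value lies inside the supported range.
    static constexpr bool IsValid(int Value)
    {
        return Value >= First && Value < First + Count;
    }

    // Maps a dimension value to a zero-based permutation id.
    static constexpr int ToDimensionValueId(int Value)
    {
        return Value - First;
    }

    // Maps a zero-based permutation id back to the dimension value.
    static constexpr int FromDimensionValueId(int Id)
    {
        return Id + First;
    }

    // Runtime-checked variant of ToDimensionValueId.
    static int CheckedToDimensionValueId(int Value)
    {
        assert(IsValid(Value));
        return ToDimensionValueId(Value);
    }
};

// Example: a hypothetical dimension taking the values 3, 4 and 5.
using FRingCountDim = TIntRangeDimension<3, 3>;
static_assert(FRingCountDim::ToDimensionValueId(3) == 0, "first value maps to id 0");
static_assert(FRingCountDim::FromDimensionValueId(2) == 5, "last id maps back to 5");
```

With this mapping the number of compiled permutations stays at Count, regardless of how large First is.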
Change 3838128 by Rolando.Caloca
DR - vk - Support for non-cached memory types
Change 3838540 by Guillaume.Abadie
Refactors Diaphragm DOF's CoC tile buffer under a single API for better maintainability.
Change 3838731 by Rolando.Caloca
DR - vk - Descriptor pools per command buffer pool (turned off)
Change 3838961 by Rolando.Caloca
DR - vk - Use ring buffer for per frame uniform buffers
- Enable descriptor pools per layout recycled per command buffer
Change 3839087 by Rolando.Caloca
DR - vk - Compile fixes for Android
Change 3839106 by Marcus.Wassmer
PR #4413: Removing unnecessary call to FString::ToLower (Contributed by gsfreema)
Change 3839252 by Mark.Satterthwaite
Fix mtlpp::Resource move operators.
Change 3839426 by Marcus.Wassmer
Duplicate 380972
Make PC GPU Benchmarks more reliable
Change 3840041 by Guillaume.Abadie
Fixes shader compilation failure in TAA with alpha channel through post processing support.
Change 3840257 by Chris.Bunner
Swapping a mul() to * in HLSLTranslator::Dot to allow scalar transformations per a UDN ticket.
Change 3840308 by Rolando.Caloca
DR - vk - Support for UB & non-UB on emulation mode
Change 3840586 by Rolando.Caloca
DR - Copy 3840577
Fix for CPUs with more than 16 cores
Change 3840671 by Rolando.Caloca
DR - vk - Copy from 3840663
Fix for layout ensure on HMD projects on Vulkan
Change 3840980 by Rolando.Caloca
DR - vk - Android compile fixes
Change 3841989 by Guillaume.Abadie
Splits Diaphragm DOF's Gather pass into multiple shader files, with the CFLAG_StandardOptimization flag for faster iteration time.
Change 3842216 by Guillaume.Abadie
Fixes DDOF's foreground alpha channel.
Change 3842217 by Guillaume.Abadie
Implements r.DOF.MaximalForegroundBlurringRadius
Change 3842353 by Guillaume.Abadie
Allows disabling foreground gathering with r.DOF.MaximalForegroundBlurringRadius=0
Change 3842747 by Rolando.Caloca
DR - vk - Missing use of GPoolSizeVRAMPercentage
- Support for smaller allocations if page size is not available
Change 3842791 by Rolando.Caloca
DR - vk - Use 95% of available GPU memory to handle some fragmentation
Change 3843690 by Guillaume.Abadie
Fixes diaphragm DOF's foreground after all this refactoring.
Change 3844439 by Guillaume.Abadie
Improves the CoC dilate pass to make the gather pass as fast as possible, but still without artifacts caused by the fast gathering optimisation.
Change 3844946 by Mark.Satterthwaite
rd_route v1.1.1 with attached TPS approval.
For macOS function interposition which is useful for debugging and the occasional workaround.
Change 3845164 by Mark.Satterthwaite
Add LLM support for macOS, including tracking of memory allocated in Objective-C. This makes use of runtime method swizzling in the Objective-C runtime and the rd_route library I added for Richard Wallis, which allows for arbitrary runtime function interposition and allows me to hook the custom allocators used in Apple's many Objective-C frameworks on which the whole macOS edifice is built. Objective-C objects are charged to the calling scope as they are too common to impose their own without murdering frame rate.
We would need a TPS approval for an iOS function interposition library for this to work fully on iOS. If desired in the short term, discarding LowLevelFree events that aren't in the map, rather than asserting, will work around the problem.
Change 3845849 by Marcus.Wassmer
Fix clang and some normal refactor errors
Change 3846026 by Rolando.Caloca
DR - vk - Descriptor set allocation scheme rewrite
- Type hash for each pool
- Desc sets Pool on device
Change 3846169 by Rolando.Caloca
DR - vk - Remove old code for non-layout descriptor set pools
Change 3846205 by Mark.Satterthwaite
Disambiguate the PatchControlPointOut struct definitions in Metal tessellation shaders at Apple's suggestion to avoid a metallib gotcha.
Change 3846346 by Arne.Schober
DR - Missing Vector instructions
Change 3847037 by Arne.Schober
DR - Fix issue with GPU skincache where the offset of the clothbuffer is not relative to the offset of the actual vertexbuffer.
Fixed MorphTarget Skincache Offset mix-up
Change 3847275 by Marcus.Wassmer
Copying MGPU to Dev-Rendering (//UE4/Dev-Rendering)
Change 3847464 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3847707 by Michael.Lentine
Only use MorphTargetOffset when the shader enables morph targets.
Change 3848533 by Richard.Wallis
Handle Metal adding FirstInstance into [[ instance_id ]], which differs from other APIs. SV_InstanceID and SV_VertexID should now have their respective base instance and base vertex IDs subtracted before use in the shader.
#jira UE-51716
Change 3848625 by Richard.Wallis
Compile Fix
Change 3848725 by Rolando.Caloca
DR - Remove use of Build/SetLocalGraphicsPipelineState
Change 3848797 by Rolando.Caloca
DR - Deprecate Build/SetLocalGraphicsPipelineState
Change 3849237 by Arne.Schober
DR - AddCustom Ver for ModelVertex Serialization
Change 3851247 by Rolando.Caloca
DR - vk - Util functions
Change 3851523 by Arne.Schober
DR - Update Reflection Comparison shot from the BuildFarm.
Change 3851859 by Rolando.Caloca
DR - vk - Skip loader
Change 3851889 by Krzysztof.Narkowicz
Removed lights with lighting channels from the tiled deferred light list. Tiled deferred lights do not support lighting channels, and it wasn't worth adding extra complexity to this shader in order to support this special case.
#jira UE-51512
Change 3852181 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3852547 by Uriel.Doyon
Fixed Pre-Exposure shader compilation and Temporal AA issue.
#jira UE-54276
Change 3852637 by Arne.Schober
DR - Fixing Normal Automated Test Result
Change 3853167 by Richard.Wallis
AvfPlayer - support for streaming media. Due to an operator new/delete mismatch in Apple's CFNetwork, we've had to swap out one of that framework's allocators using rd_route to avoid the memory corruption.
#jira UE-35637
Change 3853447 by Chris.Bunner
Fixing typos.
Change 3853645 by Krzysztof.Narkowicz
Fixed light functions on subsurface materials
Removed strange code from blending between static and dynamic shadows
#jira UE-50275
Change 3853660 by Rolando.Caloca
DR - Fix OpenGL overwriting texture samplers on forward renderer
Change 3853945 by Mark.Satterthwaite
Duplicate #3831616
Fix the black ground scattering on Metal. We've had issues with the atmospheric fog calculations for a long time: one or more intermediate operations generates different precision on Metal, so we end up passing negative values into sqrt, which then generates NaN/INF. For Metal, when compiling this file and this file only, #define sqrt() to sqrt(abs()) so that we don't see any more unexpected black in atmospheric rendering. This is far from ideal, but I don't want to abs all inputs into every sqrt because AFAIK this is the only case where we have an issue, and until we have a way to investigate each intermediate calculation that isn't ridiculously, soul-crushingly tedious, it isn't practical to identify the source of the error.
#jira UE-53720
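A small C++ sketch of the sqrt(abs()) workaround described above. The real change is a shader-side #define; PlainSqrt and SqrtOfAbs are hypothetical names used only to show why a slightly negative input is harmful and how taking the absolute value first avoids the NaN.

```cpp
#include <cassert>
#include <cmath>

// Plain square root: a slightly negative input caused by rounding error yields NaN.
inline float PlainSqrt(float X)
{
    return std::sqrt(X);
}

// Workaround: take the absolute value first, so a small negative
// input no longer produces NaN.
inline float SqrtOfAbs(float X)
{
    return std::sqrt(std::fabs(X));
}

// Self-check illustrating the difference for a tiny negative value.
inline void CheckSqrtWorkaround()
{
    const float SlightlyNegative = -1.0e-6f;
    assert(std::isnan(PlainSqrt(SlightlyNegative)));
    assert(!std::isnan(SqrtOfAbs(SlightlyNegative)));
}
```

Note that this hides the sign error rather than fixing it, which matches the "far from ideal" caveat in the description.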
Change 3853966 by Mark.Satterthwaite
Duplicate #3835852
Fix tessellation shaders in Metal with Manual Vertex Fetch enabled:
- The control points index buffer shouldn't collide with anything else.
- We can't use the optimisation of loading texture width & height from the buffer meta-table in tessellation shaders as the combined stages don't guarantee not to clobber unused buffer slots and screw it up when we use linear textures.
#jira UE-53851
Change 3854250 by Uriel.Doyon
Fix fbx automation tests
Change 3854736 by Uriel.Doyon
Added a tooltip to the EV100 slider in the exposure menu.
Using game settings now disables the slider.
#jira UE-53945
Change 3855047 by Jian.Ru
Fix DFAO getting NaNs when samples are out of ViewRect
#jira UE-54403
Change 3858197 by Krzysztof.Narkowicz
View frustum shadow caster culling for pointlights/spotlights
#jira UE-54381
Change 3860081 by Krzysztof.Narkowicz
Tighter bounding sphere for a spotlight
Replaced IntersectSphere(LightProxy->Origin, LightProxy->Radius) with LightProxy->SphereBounds for tighter culling of spotlights
Directional light GetBoundingSphere() now everywhere returns Sphere((0,0,0),HALF_WORLD_MAX) for consistency and proper SphereBounds
#jira UE-54258
Change 3860324 by Mark.Satterthwaite
Update the macOS deployment target version to 10.12 from 10.11 as we officially ended support for El Capitan a while ago. Should mean that libraries compiled for 10.12 and up won't cause link warnings.
Change 3860945 by Arne.Schober
DR - Fix not releasing SRV on render thread for FPositionVertexBuffer, FStaticMeshVertexBuffer, FColorVertexBuffer, FStaticMeshInstanceBuffer.
#jira UE-54587
Change 3861129 by Jian.Ru
Prevent distance culled objects from casting distance field direct shadows
#jira UE-54533
Change 3861502 by Jian.Ru
Exclude distance culled objects from DFAO calculation
#jira UE-54533
Change 3862243 by Krzysztof.Narkowicz
Changed radius of a directional light's bounding sphere from HALF_WORLD_MAX to WORLD_MAX in order to encompass the entire WORLD_MAX box
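The arithmetic behind the radius change can be sketched as follows. This assumes the WORLD_MAX box is a cube centered at the origin with half extent HALF_WORLD_MAX, and the numeric value of WorldMax below is an assumption for illustration; the function names are hypothetical. The corners of such a cube lie at sqrt(3) * HALF_WORLD_MAX from the origin, which is outside a sphere of radius HALF_WORLD_MAX but inside one of radius WORLD_MAX.

```cpp
#include <cassert>
#include <cmath>

// Assumed values for illustration; the real constants live in the engine headers.
constexpr double WorldMax = 2097152.0;
constexpr double HalfWorldMax = WorldMax * 0.5;

// Distance from the origin to a corner of an origin-centered cube
// with the given half extent.
inline double CubeCornerDistance(double HalfExtent)
{
    assert(HalfExtent >= 0.0);
    return std::sqrt(3.0) * HalfExtent;
}

// True if a sphere at the origin with this radius contains the whole cube.
inline bool SphereContainsCube(double Radius, double HalfExtent)
{
    return Radius >= CubeCornerDistance(HalfExtent);
}
```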
Change 3863476 by Krzysztof.Narkowicz
Added BuildReflections option to ResavePackages commandlet
#jira UE-54581
Change 3863717 by Rolando.Caloca
DR - vk - Missed using pipeline cache on compute PSOs
Change 3865332 by Arne.Schober
DR - Fix UE-52356 Bone Weight
Change 3866220 by Rolando.Caloca
DR - vk - Fixed GetNativeResource missing on textures
- Added support for -preferNvidia|AMD|Intel
- Added VulkanRHIBridge.h
- Minor fixes
Change 3866222 by Rolando.Caloca
DR - vk - Missed file
Change 3866951 by Krzysztof.Narkowicz
Fixed FreezeRendering on non editor builds: ComputeAndMarkRelevanceForViewParallel was calling FrozenMatricesGuard on multiple threads, reading and writing view matrices state in parallel.
#jira UE-53640
Change 3867231 by Guillaume.Abadie
Adds alpha mode to allow the tonemapper to passthrough the alpha channel for broadcast industry.
Change 3867233 by Guillaume.Abadie
Fixes a compilation failure in TAAU with r.PostProcessing.PropagateAlpha==2
Change 3867594 by Daniel.Wright
Removed EditorOnlyDefaultMaterials, which added 79s of shader compilation during startup
Added a dialog when opening the Material Editor on a Default Material, warning of advanced workflow
Preventing Material Editor Apply or Save for a Default Material when the preview material has compilation errors
Change 3870048 by Daniel.Wright
Cleaned up formatting in TranslucentRendering from merges
Change 3870106 by Krzysztof.Narkowicz
Fixed some FArchive Tell()/Seek() 64bit->32bit truncations
Change 3870211 by Rolando.Caloca
DR - vk - Added -vulkanvalidation=N/-vulkanstandardvalidation/-novulkanstandardvalidation to set validation layer behaviour from cmd line
Change 3870225 by Rolando.Caloca
DR - vk - Some platforms do not use a standard swapchain
Change 3870267 by Arne.Schober
DR - SafeRelease SRVs that might be held by the vertex factories (maybe due to indirect use in GlobalResources)
Note that the VFs are not owners of the data, e.g. the underlying buffers might be released before this, and this reference counting should be unnecessary
Change 3870647 by Daniel.Wright
Moved FogRendering.h to Renderer
Change 3872130 by Krzysztof.Narkowicz
Disable USE_GLOBAL_CLIP_PLANE for MATERIAL_DOMAIN_POSTPROCESS and MATERIAL_DOMAIN_UI
Merging GitHub Pull request #4459
"When material domain is not needing global clip plane there is no need to generate any code involving it. This does not alter output but removes lot of code at vertex shader and pixel shaders. At least on mobile rendered was actually generating clipping code for ui materials."
#jira UE-54616
Change 3872145 by Rolando.Caloca
DR - vk - Optional SupportsMarkersWithoutExtension
Change 3872404 by Uriel.Doyon
Added some guards when streaming virtual textures.
Fixed optimized UCanvasRenderTarget2D::RepaintCanvas() to prevent resolving the texture twice.
Fixed bad mipmap generation with UCanvasRenderTarget2D.
Change 3872507 by Arne.Schober
Back out changelist 3870267
Change 3874176 by Ben.Marsh
IncludeTool: Add a flag to prevent scanning source files for exported symbols.
Change 3874935 by Krzysztof.Narkowicz
Fixed white thumbnails and other issues with sky lighting on the ES3_1 path by disabling GGX prefiltering, as the mobile path doesn't have a single cubemap with all mips initialized. Instead it ping-pongs between 2 partially initialized ones.
#jira UE-54656
Change 3875710 by Daniel.Wright
Renamed uniform buffer member macros to be much shorter for readability
Change 3876665 by Guillaume.Abadie
Cherry-pick 3870715: Implements DOF's hybrid scattering bare bones.
Change 3876666 by Guillaume.Abadie
Cherry-pick 3871786: DOF hybrid scattering: fixes NaN source, transition to gather on close to screen edge and low intensity.
Change 3876677 by Guillaume.Abadie
Cherry-pick 3872348: Implements neighbor comparison for DOF's scattering compilation pass.
Change 3876680 by Guillaume.Abadie
Cherry-pick 3872357: Oops... fixes build...
Change 3876683 by Guillaume.Abadie
Cherry-pick 3872475: Controls the number of mips to generate with DOF's reduce pass.
Change 3876687 by Guillaume.Abadie
Cherry-pick 3874104: Fixes various bugs in diaphragm DOF's hybrid scattering.
Change 3876690 by Guillaume.Abadie
Cherry-pick 3874144: Packs multiple DOF scattering groups into the same draw instance.
Change 3876694 by Guillaume.Abadie
Cherry-pick 3874275: Switches hybrid scattering to an indexed indirect draw call to reduce scatter vertex shader invocations.
Change 3876695 by Guillaume.Abadie
Cherry-pick 3874674: Records min and max coc on DOF's setup's draw event.
Change 3876783 by Rolando.Caloca
DR - Static analysis fix
Change 3876845 by Guillaume.Abadie
Implements USceneCaptureComponent::ProfilingEventName
Change 3877197 by Rolando.Caloca
DR - vk - OQ fixes (disabled)
Change 3877428 by Krzysztof.Narkowicz
Merged with tiny tweaks Ansel photography plugin improvements from Adam Moss (GitHub pull request #4426):
-The free-roaming photography camera has new constraints by default, i.e. it can't pass through walls
-Photography session can be started and stopped programmatically, e.g. making it possible to bind photography to an alternative hotkey or button combo. This was an often-requested feature.
-Tweakables and utilities are now exposed through a Blueprint Function Library (rather than direct manipulation of console variables)
-The Ansel photography session UI now exposes some engine effect tweakables as sliders. For example, if the game is using depth-of-field then sliders are made available to allow the photographer to change the focal depth etc. The developer may suppress this behavior through the Blueprint Function Library.
-Letterboxing is now removed during multi-part capture, d'oh.
-Tiled shots are taken at full resolution even if ScreenPercentage < 100
-SSR is enabled during super-resolution shots since Ansel is now better at hiding any ensuing artifacts
-Postprocess settings are frozen at session start to avoid discontinuities during photography, i.e. wandering between postprocess volumes when the camera auto-moves for stereo and 360 shots.
#jira UE-54244
#4426
Change 3879086 by Krzysztof.Narkowicz
Fixed sky/reflection capture (without owner) update - they are now updated only with a corresponding world
Change 3879090 by Guillaume.Abadie
Fixes tons of regressions on diaphragm DOF's recombine passes.
Change 3879198 by Rolando.Caloca
DR - vk - Support for real uniform buffers on Android platforms
Change 3879993 by Krzysztof.Narkowicz
-Fixed int64->int32 FArchive offset truncation in TShaderMap, VertexFactory and TextureDerivedData
-Fixed FSerializationHistory bug, when trying to serialize 0 bytes
#jira UE-43203
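A minimal sketch of the kind of guard that avoids an int64->int32 offset truncation. FitsInInt32 and CheckedNarrow are hypothetical helper names, not engine functions; the point is that a 64-bit archive offset past the 2 GiB mark cannot be stored in a 32-bit integer without losing information.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if a 64-bit archive offset can be stored in an int32 without loss.
inline bool FitsInInt32(std::int64_t Offset)
{
    return Offset >= std::numeric_limits<std::int32_t>::min()
        && Offset <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: narrows only when the value is known to fit.
inline std::int32_t CheckedNarrow(std::int64_t Offset)
{
    assert(FitsInInt32(Offset) && "offset would be truncated");
    return static_cast<std::int32_t>(Offset);
}
```

Keeping the offset as a 64-bit value end to end, as the fix does, removes the need for the narrowing step entirely.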
Change 3881462 by Guillaume.Abadie
Implements full res DOF's setup pass for cheaper full res gathering in recombine pass.
Change 3881524 by Krzysztof.Narkowicz
Fixed compilation by removing FTickableEditorObject from FPreviewScene
Change 3881724 by Chris.Bunner
Static analysis fix.
#jira UE-54762
Change 3881861 by Rolando.Caloca
DR - vk - Fix layout warning when generating mip chain
Change 3881864 by Rolando.Caloca
DR - Use render passes on HZB
Change 3882236 by Yuriy.ODonnell
IndirectLightingColorScale is now applied to SubsurfaceLighting and DiffuseLighting. Was previously only applied to DiffuseLighting.
#jira UE-42534
#github 3326
Change 3882325 by Guillaume.Abadie
Implements FocusOnly lower gathering pass for Diaphragm DOF's slight out of focus temporal stability.
Change 3882340 by Rolando.Caloca
DR - vk - Fix api dump
Change 3882430 by Rolando.Caloca
DR - vk - KHR_maintenance2
Change 3882563 by Rolando.Caloca
DR - Add depth-stencil access mode to PSO initializer
Change 3882929 by Rolando.Caloca
DR - vk - Proper fix for maintenance extension macros
Change 3883087 by Mark.Satterthwaite
Allow disabling VSync in windowed mode for macOS 10.13.4 and above.
Change 3883597 by Guillaume.Abadie
Collapses full and half res DOF setup passes together.
Change 3883702 by Guillaume.Abadie
Fixes mac's build.
Change 3884747 by Uriel.Doyon
Fix for static analysis warning
Change 3884975 by Rolando.Caloca
DR - vk - Move some platform defines to platform properties
Change 3884988 by Rolando.Caloca
DR - vk - Make an override per platform
Change 3885832 by Rolando.Caloca
DR - vk - Cosmetic change to group similar members
Change 3885891 by Rolando.Caloca
DR - vk - Some _RenderThread functions to avoid stalls
Change 3886044 by Rolando.Caloca
DR - Added RHI api _RenderThread version of
RHICreateTextureReference
RHICreateShaderLibrary
RHICreateRenderQuery
Change 3886560 by Guillaume.Abadie
Fixes strong aliasing on TAAU's fast shader permutation.
This adds a 6th neighbor sample, and switches AA_TONE on as TAA does for its fast shader permutation.
Change 3886749 by Guillaume.Abadie
Cherry-pick 3884748: Implements DOF's BuildBokehLUT for diaphragm blades simulation.
Only used in hybrid scattering for now.
Change 3886750 by Guillaume.Abadie
Cherry-pick 3885457: Simulates diaphragm blades' curvature on bokeh.
Change 3886752 by Rolando.Caloca
DR - Fix metal static analysis
Change 3887460 by Uriel.Doyon
Fixed two more static analysis warnings.
Change 3888201 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitAfterEveryEndRenderPass
- Fixed bad layout on rendering back buffer
Change 3888209 by Rolando.Caloca
DR - vk - Unity compile fix
Change 3888254 by Rolando.Caloca
DR - vk - Fix async texture layout
Change 3888893 by Guillaume.Abadie
Simulates bokeh in DOF's slight out of focus.
Change 3889085 by Guillaume.Abadie
Fixes DOF's reduce pass sampling outside viewport.
Change 3889924 by Rolando.Caloca
DR - vk - Skip seemingly bad validation error
Change 3890573 by Daniel.Wright
Only initialize FDiaphragmDOFGlobalResource in Feature Level 5
Change 3890590 by Arne.Schober
DR - Fix Paper2d crash. When AddMesh is called the vertex and index buffers are nulled out. Re-create the dynamic mesh builder for every mesh instead.
#jira UE-55063
Change 3890638 by Arne.Schober
DR - Better fix for Paper2d which honors batching
#jira UE-55063
Change 3891099 by Krzysztof.Narkowicz
1.5 texel shadow offset fix inside Manual2x2PCF based on #4485 GitHub pull request
#jira UE-54985
#4485
Change 3891234 by Krzysztof.Narkowicz
Optimized PCF2x2 and PCF3x3 - merged #4494 GitHub pull request
#jira UE-55121
Change 3891407 by Rolando.Caloca
DR - vk - Set vendor id earlier
Change 3891417 by Rolando.Caloca
DR - vk - Missing layout transitions
Change 3891718 by Arne.Schober
DR - Do not recreate one Frame Resource for dynamic draws
#jira UE-55063
Change 3891925 by Yuriy.ODonnell
Fix/workaround for inconsistent preprocessor definitions for NVAftermath that result in FD3D11DynamicRHI class layout mismatch. NVAftermath support is now enabled by default for Win64.
NVAftermath is declared as a private dependency in D3D11RHI. It does not automatically propagate to modules that explicitly include private RHI headers (OculusHMD, OSVR, OSVRInput). This results in NV_AFTERMATH being defined while compiling RHI module and not defined when compiling other modules, causing memory corruption at runtime.
The long-term solution for this and similar issues requires some mechanism for adding transitive module dependencies, so that anyone that depends on D3D11RHI module would automatically also get the NVAftermath. Additionally, private headers should *never* be included directly by external modules.
The short-term solution is to explicitly add NVAftermath dependency to OculusHMD, OSVR and OSVRInput.
Additionally, NV_AFTERMATH is no longer forced by D3D11RHIPrivate.h when it's not defined. This allows catching this kind of mismatch in the future through a compiler warning (C4668).
#jira UE-53065
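The class layout mismatch described above can be illustrated with a small sketch. The two structs below are hypothetical and stand for the same class as seen by a module compiled with the feature macro enabled and by one compiled without it; they are not the real FD3D11DynamicRHI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class as seen by a module compiled with the feature macro set to 1.
struct FRHIWithFeature
{
    int DeviceId;
    void* FeatureHandle;   // member that exists only when the macro is enabled
    int FrameCounter;
};

// The same class as seen by a module compiled without the macro.
struct FRHIWithoutFeature
{
    int DeviceId;
    int FrameCounter;
};

// The two views disagree on where FrameCounter lives, so code built against one
// view reads or writes the wrong bytes of an object created with the other.
static_assert(offsetof(FRHIWithFeature, FrameCounter) != offsetof(FRHIWithoutFeature, FrameCounter),
              "member offsets differ between the two views");
static_assert(sizeof(FRHIWithFeature) != sizeof(FRHIWithoutFeature),
              "object sizes differ between the two views");
```

In a real build both views share one name and one header, so nothing flags the disagreement at compile time unless the undefined macro itself triggers a warning.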
Change 3891987 by Rolando.Caloca
DR - vk - Support for dedicated allocations
Change 3892339 by Jian.Ru
Fix a crash when tessellation shaders are used in dx12
#jira UE-55127
Change 3892528 by Rolando.Caloca
DR - vk - Update Linux headers
Change 3892867 by Rolando.Caloca
DR - vk - Don't create swapchain if not needed
Change 3893416 by Guillaume.Abadie
Implements bokeh simulation on foreground and background gather.
Change 3893732 by Chris.Bunner
GetRelevance_Internal should use the immediate parent resource, not the base, as some features are overridden by permutations e.g. UsesWorldPositionOffset.
#jira UE-53404
Change 3893868 by Guillaume.Abadie
Allocates diaphragm DOF's buffers and structured buffers only on supported platforms.
Change 3893917 by Chris.Bunner
Potential fix for CIS.
Change 3893933 by Chris.Bunner
Duplicating CL 2647737 as this is the same issue from that JIRA where accessing game-thread data was being prevented. We don't have this check in UMaterial::GetMaterialResource already, but presumably the UMaterialInstance case was never removed as we've not been calling it until now.
Change 3894218 by Rolando.Caloca
DR - vk - Remove stat counters per draw call, gains 10% CPU on Infiltrator
Change 3894579 by Arne.Schober
RT - Fix assert not in RenderingThread from Triangle Renderer.
#jira UE-55247
Change 3894724 by Rolando.Caloca
DR - vk - New API for batching barriers
Change 3894909 by Arne.Schober
DR - Fix crash in Speedtree wind where Renderdata is unavailable
#jira UE-54544
Change 3895414 by Rolando.Caloca
DR - Add a configurable threshold for SCW timeouts
Change 3896429 by Marcus.Wassmer
Allow variable frame-latency delay in FrameGrabber frames. For performance you want at least a 1 frame delay so you don't sync the GPU to the CPU.
Change 3896495 by Marcus.Wassmer
Set pointer properly
Fix CIS
Change 3897253 by Guillaume.Abadie
Fixes CIS warning in diaphragm DOF
Change 3899179 by Guillaume.Abadie
Implements background hybrid scatter occlusion for diaphragm DOF.
Change 3903654 by Rolando.Caloca
DR - vk - Rework dump layer to allow other layers
Change 3903766 by Rolando.Caloca
DR - vk - More wrappers
Change 3904025 by Rolando.Caloca
DR - vk - More wrappers
Change 3904342 by Rolando.Caloca
DR - vk - Track image resources & callstacks
Change 3904346 by Rolando.Caloca
DR - vk - Copy fix from 4.19 for flickering grass
Change 3904510 by Rolando.Caloca
DR - vk - Compile fix
Change 3904914 by Daniel.Wright
[Integrate] Fixed PS4 transitions with forward shading
Change 3904916 by Daniel.Wright
[Integrate] Fixed PS4 transitions with occlusion queries
Change 3905975 by Rolando.Caloca
DR - vk - Missing wrappers
Change 3905977 by Rolando.Caloca
DR - vk - Missed file
Change 3907829 by Rolando.Caloca
DR - Move depth bounds to the PSO
Change 3907832 by Rolando.Caloca
DR - vk - Prep for delaying transitions
Change 3907834 by Rolando.Caloca
DR - vk - Fix for depth stencil issues/validation errors
Change 3907967 by Rolando.Caloca
DR - vk - Linux compile
Change 3908093 by Rolando.Caloca
DR - vk - Fix depthstencil layout on descriptors
Change 3908393 by Rolando.Caloca
DR - vk - Disable dedicated allocation as it causes crashes on Nvidia 700 series
Change 3908401 by Rolando.Caloca
DR - Do transitions outside render pass
Change 3908422 by Rolando.Caloca
DR - vk - Fix transition state not getting stored
Change 3908735 by Guillaume.Abadie
Cherry-pick 3896619: Fixes post process materials after TAAU that had the wrong default buffer UV.
#jira UE-55317
Change 3908736 by Guillaume.Abadie
Cherry-pick 3891352: Fixes ensure when visualizing HDR with TAAU.
#jira UE-55019
Change 3908753 by Guillaume.Abadie
Lets the renderer lay out the views in the internal render targets as it prefers.
Change 3909119 by Daniel.Wright
Fix some static analysis warnings
Change 3911943 by Rolando.Caloca
DR - vk - Fix for packaging Vulkan projects
Change 3912145 by Rolando.Caloca
DR - vk - Fix layout on streaming textures
Change 3913029 by Rolando.Caloca
DR - Fix missing transition
Change 3913048 by Rolando.Caloca
DR - Fix for hlslcc
Change 3913054 by Rolando.Caloca
DR - vk - Fix number of layers on barrier
Change 3913171 by Rolando.Caloca
DR - vk - Fix for decal missing transition
Change 3913211 by Rolando.Caloca
DR - vk - Add debug name to image tracking
Change 3913449 by Rolando.Caloca
DR - vk - Restore transition
Change 3913466 by Rolando.Caloca
DR - Fix Vulkan EngineTest
Change 3913537 by Rolando.Caloca
DR - vk - Fixes independent samplers & textures (contributed by AMD)
Change 3913548 by Rolando.Caloca
DR - vk - Warning fix
Change 3913691 by Rolando.Caloca
DR - vk - Fixes for parallel (wip)
Change 3914656 by Rolando.Caloca
DR - vk - Fix bug when using separate samplerstates and textures
Change 3914730 by Rolando.Caloca
DR - vk - Bump version
Change 3914764 by Rolando.Caloca
DR - vk - Don't crash on exit
Change 3915532 by Rolando.Caloca
DR - vk - Parallel context fixes
Change 3915589 by Rolando.Caloca
DR - vk - Hoist and rename transition and layout manager class out of the context
Change 3915592 by Rolando.Caloca
DR - Fix gpu marker name
Change 3917607 by Rolando.Caloca
DR - vk - Fix depth bounds on Vulkan
Change 3917609 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3917616 by Rolando.Caloca
DR - Fix D3D11 initialization
Change 3920569 by Rolando.Caloca
DR - vk - Prep for layout mgr refactor
Change 3921023 by Rolando.Caloca
DR - vk - Dump layer fixes
Change 3921623 by Rolando.Caloca
DR - vk - Prep refactor for layouts
- Dump now shows marker tree
Change 3922007 by Rolando.Caloca
DR - vk - Fix extra allocation per draw call
Change 3922442 by Rolando.Caloca
DR - vk - Detect potential issues
Change 3922470 by Rolando.Caloca
DR - vk - Minor optimization
Change 3922482 by Rolando.Caloca
DR - vk - More minor optimizations
Change 3923158 by Rolando.Caloca
DR - Move r.DisableEngineAndAppRegistration out to common RHI and use it on Vulkan
Change 3923486 by Rolando.Caloca
DR - vk - Minor cpu optimizations
Change 3923505 by Rolando.Caloca
DR - vk - Use bigger allocations for uniform buffers
Change 3923516 by Rolando.Caloca
DR - vk - Android compile fix
Change 3923557 by Rolando.Caloca
DR - vk - Cache descriptorset layouts, refactor duplicated code
Change 3923851 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3924153 by Rolando.Caloca
DR - vk - Support for dynamic UBs
Change 3924193 by Rolando.Caloca
DR - vk - Remove old per pso descriptor pools
Change 3924197 by Rolando.Caloca
DR - vk - Remove unused global uniform buffer pool
Change 3924220 by Rolando.Caloca
DR - vk - Wrap some unused classes in their define
Change 3924234 by Rolando.Caloca
DR - vk - Show ring buffer wrapping messages
Change 3924243 by Rolando.Caloca
DR - vk - Fix bad dynamic buffer
Change 3924902 by Rolando.Caloca
DR - vk - Fix crash running infiltrator
Change 3925209 by Rolando.Caloca
DR - vk - Fix bug with dynamic buffers
- Remove old defines
Change 3925300 by Rolando.Caloca
DR - vk - Allow packed uniforms as dynamic UBs (with r.Vulkan.DynamicGlobalUBs)
Change 3925627 by Rolando.Caloca
DR - vk - Move DynamicOffsets into the pipeline state
Change 3925834 by Rolando.Caloca
DR - vk - Cache per stage information
Change 3925835 by Daniel.Wright
Fixed DisplayName for UParticleModuleCollisionGPU
Change 3925897 by Rolando.Caloca
DR - vk - Split update descriptors loop
Change 3926488 by Rolando.Caloca
DR - vk - 16MB for ring buffer on desktop, 8 MB for mobile
Change 3928168 by Guillaume.Abadie
Cherry-pick 3917219: Implements r.DOF.RecombineQuality
Change 3928173 by Guillaume.Abadie
Cherry-pick 3927888: Enables r.DOF.HybridScatter.BackgroundCompositing and r.DOF.HybridScatter.ForegroundCompositing to work when both enabled.
Change 3928216 by Rolando.Caloca
DR - vk - Fix Android
- Fix static analysis
Change 3929119 by Rolando.Caloca
DR - vk - Rename some classes for clarity
- Fix read-only cvar
Change 3929151 by Rolando.Caloca
DR - vk - Rename class
Change 3930046 by Rolando.Caloca
DR - Temp fix Vulkan flickering grass
Change 3930148 by Rolando.Caloca
DR - vk - Only update dirty descriptors
- Use dynamic descriptors for packed global uniform buffers
Change 3930998 by Guillaume.Abadie
Packs shader permutation in different XGE submissions.
Change 3931079 by Rolando.Caloca
DR - vk - Fixes for Android and non-real ubs platforms
Change 3931942 by Krzysztof.Narkowicz
Depth rendering - When EarlyZPassMode is set to DDM_AllOccluders, dynamic objects also need to test bUseAsOccluder just like static ones
#jira none
Change 3932819 by Daniel.Wright
[Integrate] Scene Textures uniform buffer
* Base Pass Uniform Buffer now contains a Scene Textures uniform buffer. Previously the translucent base pass had to check ~40 loose scene texture parameters every draw.
* FMeshMaterialShaders must now bind PassUniformBuffer and supply a valid pass uniform buffer. For most passes this is just FSceneTextureUniformParameters.
* FRendererModule::DrawTileMesh can now cleanly set dummy scene texture resources, just by configuring how the pass uniform buffer is created.
* Moved scene texture shader functions out of Common, into SceneTexturesCommon which must be manually included by shaders that want to use them
* Separate Mobile Scene Textures uniform buffer to silo the platform complexities
Moved DBuffer inputs out of FDeferredPixelShaderParameters and into FOpaqueBasePassUniformParameters
Removed per-frame material uniform expressions. GameTime material node with period is now implemented with an fmod in the shader, without the use of MaterialFloat, so that it will happen at full precision.
* Per-frame expressions were used when the GameTime material node had a period, to do the fmod on the CPU where 32 bit precision is guaranteed, for mobile GPU's where pixel shader precision is sometimes less than 32fp.
Moved forward shading data into the Base Pass Uniform Buffer
Removed instanced stereo support for the light cull grid - will have to be reimplemented without changing SRV's per draw
Base pass sets View Uniform Buffer from DrawRenderState instead of choosing which one to set per-draw
Fixed padding in nested uniform buffer structs
Skip SRV members on Feature Level SM4 and below
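To illustrate the GameTime change in this entry: the wrap that the per-frame expressions used to perform on the CPU is a single fmod. The C++ sketch below only shows that arithmetic. `PeriodicGameTime` is a hypothetical name, not an engine function, and the actual change is in generated shader code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not engine code): wraps a time value into [0, Period)
// using a full-precision float fmod, mirroring what the shader now does.
inline float PeriodicGameTime(float GameTimeSeconds, float PeriodSeconds)
{
    assert(PeriodSeconds > 0.0f); // a zero or negative period is not meaningful here
    return std::fmod(GameTimeSeconds, PeriodSeconds);
}
```

For example, `PeriodicGameTime(7.5f, 2.0f)` yields `1.5f`.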
Change 3932964 by Rolando.Caloca
DR - vk - Renderdoc on Android
Change 3933095 by Daniel.Wright
Moved FSceneTextureUniformParameters out of the opaque base pass uniform buffer.
* Base Pass shaders now enable SCENE_TEXTURES_DISABLED when compiling for a material of any domain other than MD_Surface. These are used when rendering thumbnails of a material in a different domain, which could be opaque, but the opaque base pass drawing policy does not bind a scene textures uniform buffer, so the shader must not bind it.
* Opaque materials can no longer use EyeAdaptation.
Change 3933096 by Daniel.Wright
Better d3d11 assert message when a uniform buffer was not set by the renderer
Change 3933176 by Rolando.Caloca
DR - vk - Prefer mailbox if available
Change 3933271 by Ryan.Vance
#jira UE-55936
Fixed missing referenced uniform bindings on AR pass-through camera shaders.
Change 3934000 by Guillaume.Abadie
Fixes Win32 build in ShaderCompilerXGE.cpp
Change 3934299 by Guillaume.Abadie
Fixes a bug in DOF's reduce operator that was causing color leaking between background and foreground.
Change 3934699 by Daniel.Wright
Added bAffectDistanceFieldLighting to landscape
Change 3935190 by Daniel.Wright
Forward Light Grid SRV's use StructuredBuffer on Metal, instead of 'invariant Buffer', which throws off RemoveUniformBuffersFromSource parsing
Change 3935606 by Daniel.Wright
Removed LightmapPolicy::Set which was needed for vertex lightmaps
Renamed FVertexFactory::Set to SetStreams to make it findable
Change 3936510 by Rolando.Caloca
DR - vk - Update glslangValidator.exe to 1.0.65.1 for dumped debug SPIRV shaders
Change 3936545 by Richard.Wallis
Clone of CL's (3925763, 3925430, 3925424, 3925385, 3925278) Mark Satt's Xcode fixes from task stream //Tasks/UE4/Dev-UERNDR-354-mtlpp/
Plus XCode 9.2 compile fix in ApplicationPlatformCompilerPreSetup.h for -Wunused-lambda-capture.
Change 3938061 by Daniel.Wright
Vulkan: Added support for SRV's in Uniform Buffers
Change 3938123 by Daniel.Wright
Vulkan: Slightly better assert for null resources in uniform buffer
Change 3939197 by Rolando.Caloca
DR - vk - Disable custom memory mgmt
Change 3939677 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3939809 by Rolando.Caloca
DR - vk - Fixes for async compute
Change 3939875 by Rolando.Caloca
DR - vk - Support for -vktrace
Change 3939977 by Rolando.Caloca
DR - vk - Skip a condition during gather UBs
- Set up efficient compute async var
- Fix validation cmd line
Change 3939982 by Rolando.Caloca
DR - vk - Revert mipchain
Change 3939984 by Rolando.Caloca
DR - vk - Remove unnecessary asserts
Change 3940082 by Rolando.Caloca
DR - vk - Custom mem mgr
Change 3940475 by Rolando.Caloca
DR - vk - Fix DFAO (indirect draw offset)
Change 3940555 by Rolando.Caloca
DR - vk - Minor fixes
Change 3940675 by Rolando.Caloca
DR - vk - Fix indirect type mismatch
Change 3941111 by Rolando.Caloca
DR - Renderpass bGeneratingMips
Change 3941847 by Daniel.Wright
Fixed Volumetric Lightmaps on Static geometry only working if the geometry had been built with Surface Lightmaps before
Change 3941978 by Rolando.Caloca
DR - vk - Minor fixes for presenting on compute queue
Change 3942074 by Rolando.Caloca
DR - vk - Remove some RHI stalls
- Fixed swap chain stat
Change 3943946 by Daniel.Wright
Fixed Texcoord0 on Volume materials on a particle sprite, including SubUV particles.
Change 3944065 by Daniel.Wright
Fixed SceneDepth collision getting broken on GPU particles when a scene capture is rendering
Change 3944158 by Daniel.Wright
Fixed ViewUniformShaderParameters accessing GEngine->PreIntegratedSkinBRDFTexture too early during slate loading screen
Change 3944865 by Rolando.Caloca
DR - vk - Prep for render passes
Change 3945196 by Rolando.Caloca
DR - Move render pass validate to cpp
Change 3945202 by Rolando.Caloca
DR - vk - Some fixes for using real render passes
Change 3945357 by Rolando.Caloca
DR - Fix bad condition
Change 3946295 by Yuriy.ODonnell
Added a sentinel member to FLightMap, which is initialized in the ctor and reset in the dtor. Sentinel is then checked in FLightCacheInterface::GetLightMapInteraction().
This aims to shed some more light on a hard-to-repro crash, which is suspected to be a use-after-free bug: http://crashreporter/Buggs/Show/1785593
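The sentinel technique described in this change can be sketched as follows. This is an assumption-laden illustration, not the FLightMap code: the class name, the magic value and `IsValid` are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class guarded by a lifetime sentinel.
class FLightMapSketch
{
public:
    // Arbitrary magic value chosen for illustration only.
    static constexpr std::uint32_t AliveSentinel = 0xA11CE5EDu;

    FLightMapSketch() : Sentinel(AliveSentinel) {}

    // Reset on destruction so a later read through a dangling pointer is
    // likely to see a non-matching value. The member is volatile so the
    // compiler does not drop this store as dead.
    ~FLightMapSketch() { Sentinel = 0; }

    bool IsValid() const { return Sentinel == AliveSentinel; }

private:
    volatile std::uint32_t Sentinel;
};

// Hypothetical call-site check, analogous to validating before use.
inline bool CheckLightMap(const FLightMapSketch* Map)
{
    return Map != nullptr && Map->IsValid();
}
```

Reading a destroyed object is still undefined behaviour, so a check like this is a debugging aid that raises the chance of catching a use-after-free, not a guarantee.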
Change 3946407 by Rolando.Caloca
DR - vk - Prep for refactor
Change 3946648 by Rolando.Caloca
DR - vk - Fixes for async compute (wip)
Change 3947299 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3948434 by Rolando.Caloca
DR - vk - Fix exiting with parallel
Change 3948928 by Rolando.Caloca
DR - vk - Fix enabling draw markers for tools
Change 3949021 by Rolando.Caloca
DR - vk - Buffer tracking layer
Change 3949602 by Rolando.Caloca
DR - vk - static analysis fix
Change 3949757 by Rolando.Caloca
DR - vk - Remove bogus parameter
Change 3949810 by Rolando.Caloca
DR - vk - Move waits for cmd buffer
Change 3950270 by Guillaume.Abadie
Implements dedicated gather pass for foreground hole filling to avoid being VGPR bound in foreground gather pass, while still being able to amend foreground.
Change 3950272 by Rolando.Caloca
DR - vk - Minor refactor for semaphores
Change 3950279 by Guillaume.Abadie
Oops... fixes build
Change 3950298 by Rolando.Caloca
DR - vk - Gather wait semaphores in the cmd buffers
Change 3950371 by Rolando.Caloca
DR - vk - fixes for async compute
Change 3950597 by Rolando.Caloca
DR - vk - Fix for clip distance (fixes planar reflections)
Change 3951075 by Rolando.Caloca
DR - vk - Fix for async compute
Change 3952524 by Guillaume.Abadie
Some DOF enum refactoring.
Change 3955016 by Daniel.Wright
Fixed BuiltData package getting renamed into the map package during a content browser folder move, causing a redirector to be incorrectly placed in the map package
Change 3955668 by Guillaume.Abadie
Fixes a bug where full res coc buffer was computed even if not doing slight out of focus.
Change 3956722 by Guillaume.Abadie
Fixes a bug where r.DOF.MaximalForegroundBlurringRadius was screen percentage dependent.
Change 3959212 by Guillaume.Abadie
Prefixes all DOF's shaders files with DOF keyword.
Change 3959705 by Guillaume.Abadie
Optimises the DOF setup pass outputting half res and full res with LDS downsample.
Change 3959941 by Guillaume.Abadie
Halves DOF's hybrid scatter compilation by using a unique downsampling for both foreground and background, instead of 2 reduce passes.
Change 3962273 by Rolando.Caloca
DR - Fix typos
#jira UE-56317
PR #4586
Change 3962615 by Rolando.Caloca
DR - vk - Compile fix
Change 3962949 by Rolando.Caloca
DR - Fix DOFDownsample extension
Change 3962993 by Guillaume.Abadie
Back out changelist 3962949
Change 3963016 by Guillaume.Abadie
Adds missing DOFDownsample.usf
Change 3963041 by Rolando.Caloca
DR - vk - Misc changes to help integrate
Change 3964293 by Guillaume.Abadie
Fixes DOF's setup pass reading outside of the viewport.
Change 3964475 by Guillaume.Abadie
Collapses DOF's hybrid scatter compilation passes into reduce passes.
Change 3964883 by Daniel.Wright
Fixed 3d texture in uniform buffer on unsupporting RHI
Change 3964897 by Rolando.Caloca
DR - Compile fixes
Change 3964914 by Guillaume.Abadie
Fixes a bug on r.DOF.RecombineQuality=0
Change 3965153 by Guillaume.Abadie
Fixes compile warning in D3D12Commands.cpp.
Change 3965814 by Rolando.Caloca
DR - Prep for integration conflict resolve
Change 3965899 by Rolando.Caloca
DR - Fix odd linkage issue
Change 3966072 by Rolando.Caloca
DR - More prep for merge
Change 3966163 by Rolando.Caloca
DR - Merge prep
Change 3966844 by Guillaume.Abadie
Packs multiple DOF scattered bokeh per instance and uses PT_RectList in DOF for platforms that can.
Change 3967116 by Rolando.Caloca
DR - Compile fixes for integration
Change 3967273 by Rolando.Caloca
DR - Use same path for mip generation
Change 3967277 by Rolando.Caloca
DR - vk - Fix mips on cubemaps
Change 3967693 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, missing shaders
Change 3967851 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, Engine 2/2
Change 3968083 by Rolando.Caloca
DR - Integration compile fixes
Change 3968240 by Rolando.Caloca
DR - Shader compile fixes for integration
Change 3968270 by Rolando.Caloca
DR - Fix for missing hash calculation
Change 3969426 by Rolando.Caloca
DR - vk - Fix warning
Change 3969869 by Krzysztof.Narkowicz
Back out changelist 3946295 - UE-54537 is fixed, so no need for this debug sentinel.
#jira none
Change 3969944 by Rolando.Caloca
DR - Warning fix
Change 3970020 by Rolando.Caloca
DR - Bump after integration
Change 3970052 by Rolando.Caloca
DR - Fix for mobile
Change 3970236 by Daniel.Wright
Causing decal shader to recompile to fix a merge bug
Change 3970270 by Daniel.Wright
Bump shader version from merge
Change 3970339 by Olaf.Piesche
Replace series of locks/unlocks with a single one for curve injection
#tests QAGame
Change 3970390 by Rolando.Caloca
DR - Rename FSceneTextureUniformParameters to FSceneTexturesUniformParameters
- Remove duplicate method for occlusion queries
Change 3970523 by Rolando.Caloca
DR - Fix serialization of shaders
Change 3970533 by Arne.Schober
DR - fix for removing the Speed tree wind when the scene gets deleted. The original enque rendercommand requeues the element onto the renderthread although the call already came from the Renderthread and the scene can get lost in between.
#jira UE-56322
Change 3971160 by Guillaume.Abadie
Fixes CompositeEditorPrimtive pass and SelectionOutline pass for VR editor to work with TAAU.
Change 3971516 by Guillaume.Abadie
Cherry-pick 3912629: Fixes SSR that was computing vignetting according to PrevScreen, which could let some outside-viewport samples through when rotating the camera.
#jira UE-55353
Change 3971594 by Krzysztof.Narkowicz
Fixed assert inside BindLightMapVertexBuffer. FSplineMeshSceneProxy was calling BindLightMapVertexBuffer for invalid (still not generated) lightmap UV channel after mesh reimport. Simplified assert, as at the moment almost all of the high callsites already clamp lightmap uv channel.
#jira UE-56321
Change 3971622 by Krzysztof.Narkowicz
Fixed crash inside Indirect Lighting Cache. Data (reflection captures and lightmap) generation calls ULevel::GetOrCreateMapBuildData(), which can destroy lightmap data if level has legacy data. Last Lightmap generation step recreates this data, but if user cancels lightmap generation - it won't do that.
#jira UE-56171
Change 3974788 by Rolando.Caloca
DR - Remove GSupportsGenerateMips
Change 3974789 by Rolando.Caloca
DR - Remove bogus function
Change 3974986 by Rolando.Caloca
DR - vk - Tracking fixes
Change 3974989 by Rolando.Caloca
DR - vk - Don't submit dummy barriers
Change 3975075 by Olaf.Piesche
Update for particle curve injection improvement, fixing ES2 problems
#tests QAGame tm-shadermodels, various color curve tests in-editor
Change 3975957 by Uriel.Doyon
Fixed invalid max texture resolution when using the bake material tools.
Change 3978471 by Daniel.Wright
New cvar r.SkylightUpdateEveryFrame
Change 3978779 by Rolando.Caloca
DR - Accessor for texture sizes
Change 3978797 by Rolando.Caloca
DR - Clean up RHI CopyTexture API
Change 3978832 by Rolando.Caloca
DR - vk - Workaround for RenderDoc crashing due to Descriptor Pool reset
Change 3978836 by Rolando.Caloca
DR - vk - Remove generate mips
Change 3979201 by Rolando.Caloca
DR - vk - RHI CopyTexture. Uses general layout for generating mips
Change 3979204 by Rolando.Caloca
DR - Use render passes and CopyTexture to generate mips
Change 3979592 by Rolando.Caloca
DR - Warning fix
Change 3980855 by Krzysztof.Narkowicz
Optimize bounding sphere radius after non-uniform scale by using bounding box extent.
#jira UE-56227
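One plausible reading of the bounding sphere change above: scaling the sphere radius by the largest scale component is safe but loose under non-uniform scale, while the scaled box extent gives a second valid bound, so the smaller of the two can be used. The sketch below assumes a pure scale (no rotation) and uses hypothetical types; it is not the engine's FBoxSphereBounds code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal types for illustration.
struct FVec3Sketch { float X, Y, Z; };
struct FBoundsSketch { FVec3Sketch BoxExtent; float SphereRadius; };

inline FBoundsSketch ScaleBounds(const FBoundsSketch& In, const FVec3Sketch& Scale)
{
    assert(In.SphereRadius >= 0.0f);

    const FVec3Sketch AbsScale{ std::fabs(Scale.X), std::fabs(Scale.Y), std::fabs(Scale.Z) };
    const FVec3Sketch Extent{ In.BoxExtent.X * AbsScale.X,
                              In.BoxExtent.Y * AbsScale.Y,
                              In.BoxExtent.Z * AbsScale.Z };

    // Conservative bound: original radius times the largest scale component.
    const float MaxScale = std::max({ AbsScale.X, AbsScale.Y, AbsScale.Z });
    const float ConservativeRadius = In.SphereRadius * MaxScale;

    // Alternative bound: half-diagonal of the scaled box.
    const float BoxRadius = std::sqrt(Extent.X * Extent.X + Extent.Y * Extent.Y + Extent.Z * Extent.Z);

    return FBoundsSketch{ Extent, std::min(ConservativeRadius, BoxRadius) };
}
```

For a unit box scaled by (10, 1, 1), the conservative radius is about 17.3 while the box-based radius is about 10.1, which is the kind of tightening the change description suggests.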
Change 3981065 by Rolando.Caloca
DR - vk - Fix bad layout
#jira UE-56238
Change 3981346 by Rolando.Caloca
DR - Copy from 3707257
Support for not flushing compute jobs (r.D3D11.UAVFlushNV)
Change 3981347 by Rolando.Caloca
DR - Copy from 3707257
Don't flush between morph dispatches
Change 3981932 by Mark.Satterthwaite
Generate the shader hash and function name when a Metal shader error needs to be reported so that even without shader code we get something to go on.
Change 3982442 by Rolando.Caloca
DR - Fix warning
Change 3982652 by Rolando.Caloca
DR - vk - Signal semaphore cleanup
Change 3983917 by Richard.Wallis
Clone of CL 3974146 converted for mtlpp along with extra mtlpp usage suggestions by Mark Satt:
Fix for black flickering on first paint with weighted material landscape on Mac. When using AsyncCopyFromBufferToTexture in Metal we put the blit operation on the prologue encoder - however after a draw call using that resource the copy operation should happen after on the current encoder, this keeps the correct order of operations.
Added Bool return from various Async renderpass resource requests so the caller can decide the correct further action. Updated to include the other async functions.
Change 3984409 by Guillaume.Abadie
Attempts to make static analysis happy again.
Change 3984435 by Nick.Bullard
Checking in Performance Test level provided to us by Tor Frick based on UE-44841.
This has been utilized for checking issues against Aftermath performance impact.
The Map includes 2 Level Book marks, most testing has been done against Bookmark 1 view, in fullscreen, in game mode
Change 3985087 by Mark.Satterthwaite
Make sure that the particle scratch buffer is large enough to hold all the data for the curve texture we are rendering to, otherwise a full set of curves will start scribbling memory after 64Kb (the curve texture is 256Kb of data - 512x512x4 as sizeof(RGBAUInt8) == 4). This happens in ElementalDemo.
Change 3985201 by Rolando.Caloca
DR - Fix bad CopyTexture
Change 3985258 by Mark.Satterthwaite
Try and detect orientation changes so that we don't blow-up on iOS due to a huge mismatch between the drawable texture for the display and the scene's depth-stencil target. I can't just fiddle with the depth-stencil texture itself without running the risk of obliterating in-use data and really we shouldn't permit such a mismatch anyway but it is fallout from 3620990.
#jira UE-55756
Change 3986449 by Rolando.Caloca
DR - vk - Update & consolidate Vulkan headers to 1.1.70.1
Consolidate SDK into one
Change 3986571 by Guillaume.Abadie
Makes PVS-Studio happy again in DOF.
Change 3987039 by Yuriy.ODonnell
Initial implementation of tracing profiler to show CPU and multiple GPUs on the same timeline. Currently only supported on DX12 platforms.
Use `TracingProfiler frames=N` console command to trigger a capture of the next N frames. Trace is saved to disk as a JSON file into `Saved/Profiling/Traces` directory.
Trace file uses Google Tracing format and can be visualized in Chrome built-in profiler (chrome://tracing).
`r.GPUStatsChildTimesIncluded=1` CVar makes timing scopes hierarchical.
`TracingProfiler.BufferSize=N` CVar controls the size of the tracing buffer, which may need to be increased for long traces (default is 65k events). Can only be set at startup.
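For readers unfamiliar with the trace format mentioned above, here is a minimal sketch of writing "complete" events (`"ph":"X"`, timestamps in microseconds) in the Trace Event Format that chrome://tracing loads. The struct, the function name and the choice of fields are assumptions for illustration; the profiler's actual output may contain different or additional fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical event record: one timed scope on one thread or GPU queue.
struct FTraceEventSketch
{
    std::string Name;
    std::uint64_t StartUs;    // start timestamp in microseconds
    std::uint64_t DurationUs; // duration in microseconds
    int ThreadId;
};

// Serialises events as {"traceEvents":[...]}.
// Assumes names contain no quotes or backslashes (no JSON escaping is done).
inline std::string ToChromeTraceJson(const std::vector<FTraceEventSketch>& Events)
{
    std::ostringstream Out;
    Out << "{\"traceEvents\":[";
    for (std::size_t Index = 0; Index < Events.size(); ++Index)
    {
        const FTraceEventSketch& Event = Events[Index];
        if (Index > 0)
        {
            Out << ",";
        }
        Out << "{\"name\":\"" << Event.Name << "\",\"ph\":\"X\""
            << ",\"ts\":" << Event.StartUs
            << ",\"dur\":" << Event.DurationUs
            << ",\"pid\":0,\"tid\":" << Event.ThreadId << "}";
    }
    Out << "]}";
    return Out.str();
}
```

Giving CPU threads and GPU queues distinct `tid` values is one way to get them onto separate rows of the same timeline.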
Change 3987074 by Yuriy.ODonnell
Implemented timestamp calibration on DX11. Calibration is only performed when tracing profiler session starts.
Change 3987160 by Yuriy.ODonnell
Added thread naming and ordering to the tracing profiler output
Change 3987331 by Mark.Satterthwaite
Remove the Nvidia hack to retain resource references in command-buffers for UE-46604 as the mtlpp refactor provides stronger resource lifetime guarantees.
#jira UE-46604
Change 3987754 by Mark.Satterthwaite
Fix MetalRHI memory reporting in non-default path.
PR #4568
Change 3988184 by Arciel.Rekman
Linux: Fix editor OpenGL performance (UE-55960).
- GetCurrentThreadId() calls became much more frequent with the OpenGL RHIT refactor.
- We used to only cache that value in monolithic builds, because having per-thread static variables in dynamic libraries is risky due to OS limits.
- This change adds dynamically-managed per-thread cache for non-monolithic builds.
#jira UE-55960
Change 3988394 by Rolando.Caloca
DR - vk - Improve memory mgmt
- Use 256MB pages for Device heap (or 1/8th if less).
- Remove texture allocations not going through resource manager
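The page size rule in this entry can be read as "256MB, or one eighth of the heap when that is smaller". A minimal sketch under that reading follows; the function name is hypothetical and the real allocator may round or align differently.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: picks a device-heap page size of 256MB, capped at
// one eighth of the heap size for small heaps.
inline std::uint64_t DeviceHeapPageSize(std::uint64_t HeapSizeBytes)
{
    constexpr std::uint64_t MaxPageSize = 256ull * 1024 * 1024;
    return std::min(MaxPageSize, HeapSizeBytes / 8);
}
```

Under this reading an 8GB heap uses 256MB pages, while a 1GB heap uses 128MB pages.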
Change 3988405 by Marcin.Undak
Fix VulkanQuery crash on exit #codereview rolando.caloca #codereview arciel.rekman #rb arciel.rekman
Change 3988567 by Rolando.Caloca
DR - vk - Support for packed global UBs on pci aperture heap
Change 3988668 by Rolando.Caloca
DR - vk - Remove old comments
Change 3988956 by Marcin.Undak
RecordPerformance: added option to skip building/cooking before tests #rb none #codereview arciel.rekman
Change 3989161 by Yuriy.ODonnell
Static analysis error fix
Change 3989196 by Guillaume.Abadie
Fixes a crash in light shaft's TAA pass.
#jira UE-57366
Change 3989207 by Yuriy.ODonnell
Refactored FRealtimeGPUProfilerFrame to avoid splitting profile events when calculating exclusive times of scopes. This allows tracing profiler to retain the hierarchical view of the data, while keeping CSV and GPU Stat system behavior intact.
Change 3989469 by Rolando.Caloca
DR - vk - Fix for bad index; fix for bad transition
Change 3989772 by Yuriy.ODonnell
Implemented timestamp calibration on Vulkan
Change 3990040 by Marcus.Wassmer
Aftermath enabled by default.
Removed unnecessary warning for other vendors
Change 3990064 by Mark.Satterthwaite
Ensure that packed globals are reuploaded when the command-encoder is restarted - don't simply invalidate the existing parameters. This properly handles cases where a single logical render-pass is broken into multiple command-encoders and/or command-buffers - otherwise all shaders must reset all parameters each time. When we move between frames we *do* want to perform a full state reset though as previous frame globals are treated as invalid.
Change 3990080 by Mark.Satterthwaite
Change the way we invalidate the visibility buffer between command-buffers and command-encoders so that on iOS you can reuse the same buffer within the same command-buffer, but not across more than one. The code provides an exception to this rule when running under the MetalRHI validation tools which can break each draw call into its own buffer.
Change 3990084 by Mark.Satterthwaite
Get MetalStatistics compiling again.
Change 3990381 by Arciel.Rekman
Bring back D3D12 in RecordPerformance.
Change 3991113 by Rolando.Caloca
DR - Fix crash on RHI thread on mobile preview
- Check RHI objects are not null in the PSO initializer
Change 3991191 by Ryan.Vance
#jira UE-55952
Reimplemented instanced stereo for forward lighting cull grid after the srv/ub clean up.
Change 3991343 by Rolando.Caloca
DR - Copy from 3911492
UE4 - Disabled parallel mobile base pass by default. This is experimental and not known to be useful on any mobile platform.
Change 3991375 by Mark.Satterthwaite
Proper copyright assignment in the mtlpp debugger header.
Change 3993151 by Daniel.Wright
Fix RTDF resource transition found by Rolando
Change 3993818 by Rolando.Caloca
DR - Missed file
Change 3993923 by Krzysztof.Narkowicz
Fixed crashes inside RemoveSpeedTreeWind() and RemoveSpeedTreeWind_RenderThread().
FStaticMeshComponentRecreateRenderStateContext didn't flush deferred render updates causing stale RenderData to be left:
1. Thumbnail manager called SetStaticMesh(nullptr), which added StaticMeshComponent to deferred render updates.
2. UStaticMesh::Build called FStaticMeshComponentRecreateRenderStateContext and destroyed RenderData, but didn't touch the Thumbnail manager's StaticMeshComponent as it was nullptr.
3. This resulted in a StaticMeshComponent with stale RenderData pointer.
#jira UE-54544
Change 3994033 by Rolando.Caloca
DR - vk - Reworked layers & extensions, as we were not doing it properly
- Remove -vulkanstandardvalidation and -novulkanstandardvalidation as they are not needed anymore
Change 3994275 by Mark.Satterthwaite
Change to linking against mtlpp via AddEngineThirdPartyPrivateStaticDependencies and marking its header with THIRD_PARTY_* macros in the vain hope that might convince the remote compilation code to distribute the module to the remote machine when building MetalRHI.
#jira UE-57507
Change 3994365 by Mark.Satterthwaite
Pilfer some code from the old MetalHeap file to handle calculating texture memory size on older macOS and iOS builds when running with stats or LLM enabled.
#jira UE-57513
Change 3994382 by Rolando.Caloca
DR - vk - Some missing locks during image tracking
Change 3994422 by Rolando.Caloca
DR - vk - Remove bogus shader format
Change 3995530 by Rolando.Caloca
DR - vk - Fix for crash when validation is enabled
Change 3995531 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3995532 by Rolando.Caloca
DR - vk - Added support for r.Vulkan.SaveValidationCache
Change 3995610 by Uriel.Doyon
Texture Streaming Changes and Fixes:
- Using small FOV items (like scopes) now only affects visible primitives (through "r.Streaming.MaxHiddenPrimitiveViewBoost").
- Static components added after the level is registered in the streaming manager are now handled correctly (fixes the low quality on the chests)
- Dynamic components do not need to register to the streaming manager anymore.
- Optimized dynamic component management by removing duplicate entries in the update list.
- Added a pregarbage collect pass to the dynamic component management to optimize GC handling.
- Added a budget reset logic whenever the scene requirements change significantly.
- PIE worlds now have correct visibility information.
- Fixed possible invalid memory access when processing the streaming manager slave views.
- Refactored the incremental level texture data build to prevent new components from being unhandled.
- Removed StreamingManager callbacks for NotifyActorSpawned() and NotifyPrimitiveAttached()
- Added a StreamingManager callback NotifyPrimitiveUpdated(), to be used whenever a primitive streaming state must be updated.
#jira none
Change 3995908 by Arciel.Rekman
Fix compile errors when using new Vulkan queries.
Change 3995990 by Arciel.Rekman
More compile fixes to new Vulkan queries.
- MSVC did not catch this, clang did.
Change 3996101 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3996323 by Mark.Satterthwaite
Use the right include path to export the mtlpp headers.
#jira UE-57507
Change 3996392 by Arciel.Rekman
Vulkan: fix crash on start when using new queries.
- CommandBufferManager was not yet set at that point and the code in queries relied on it.
Change 3996585 by Rolando.Caloca
DR - Slight improvement to GL being black, but just a temporary 'workaround' as it's not correct.
Change 3998806 by Arciel.Rekman
Fix Linux build (UE-57602).
#jira UE-57602
Change 3998866 by Arciel.Rekman
SubwaySequencer: fix old shader platform name.
Change 3998947 by Mark.Satterthwaite
Silence deprecation warnings in CEF on macOS now that we've moved to 10.12 as the minimum.
#jira UE-57577
Change 3998951 by Mark.Satterthwaite
Fix last of the deprecation errors that I am aware of for macOS 10.12.
#jira UE-57581
Change 3998984 by Mark.Satterthwaite
Build mtlpp for iOS 9.0 not 9.3.
#jira UE-57586
Change 3999065 by Rolando.Caloca
DR - vk - Make sure we use version 1.0.0
#jira UE-57521
Change 3999071 by Arne.Schober
DR - [UE-55433, UE-57361] Hack SNORM support in OpenGL by re-interpreting UNORM. Underlying data is always SNORM.
#jira UE-55433, UE-57361
Change 3999494 by Rolando.Caloca
DR - Enable r.UnbindResourcesBetweenDrawsInDX11 in debug
- Clear compute resources when r.UnbindResourcesBetweenDrawsInDX11 is enabled
Change 4000197 by Krzysztof.Narkowicz
Mesh simplifier - normalize TexCoordWeights using min/max TexCoord range. This fixes precision issues for very big TexCoord values and allows optimizing for all TexCoord channels when channels have values of different magnitudes (e.g. non-standard TexCoord data).
#jira UE-54935
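A rough sketch of the normalization idea in this change, assuming it means dividing each channel's weight by that channel's value range so that channels of very different magnitudes contribute comparably. The function name and the exact formula are assumptions, not the mesh simplifier's code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: scales a channel's base weight by 1 / (max - min) of
// the values seen in that channel. Falls back to the base weight when the
// channel is empty or constant.
inline float NormalizedChannelWeight(const std::vector<float>& Values, float BaseWeight)
{
    assert(BaseWeight >= 0.0f);
    if (Values.empty())
    {
        return BaseWeight;
    }
    const auto MinMax = std::minmax_element(Values.begin(), Values.end());
    const float Range = *MinMax.second - *MinMax.first;
    return Range > 0.0f ? BaseWeight / Range : BaseWeight;
}
```

With this scheme a channel spanning [2, 6] gets a quarter of its base weight, so an error of one unit there counts the same as an error of 0.25 in a channel spanning [0, 1].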
Change 4000305 by Yuriy.ODonnell
Suppress PVS Studio warning V547 (Expression is always true) related to Aftermath
Reported issue to PVS team and to NVIDIA. Confirmed false positive, fix coming in future PVS version (v6.24).
#jira UE-57579
Change 4000853 by Arciel.Rekman
Linux: fix not calling CrashReportClient (UE-57678).
#jira UE-57678
Change 4001504 by Rolando.Caloca
DR - vk - Fix transition
Change 4002460 by Krzysztof.Narkowicz
Toggle for contact shadow length in world space
Exposed contact shadows to Blueprints
#jira none
Change 4002608 by Rolando.Caloca
DR - vk - Fix static analysis
- Fix potential debug image tracking crash
- Comment out unused methods
Change 4002615 by Rolando.Caloca
DR - vk - Allow r.Vulkan.WaitForIdleOnSubmit to be set at startup (e.g. in ConsoleVariables.ini)
Previously, if your map needed to UpdateSkyCaptureContents on startup, an ensure would fail if GWaitForIdleOnSubmit was set.
PrepareForCPURead needs to wait for the command buffer to finish before trying to read the results back, but the wait has already happened when r.Vulkan.WaitForIdleOnSubmit is set. Trying to wait again correctly complains that the command buffer is not in the correct state. So, skip the WaitForCmdBuffer call when r.Vulkan.WaitForIdleOnSubmit is set.
Change 4002640 by Rolando.Caloca
DR - vk - Missing support for CVarDefaultBackBufferPixelFormat
Change 4002919 by Guillaume.Abadie
Implements DOF's temporal upsampling pass for better dynamic resolution stability.
Change 4002984 by Guillaume.Abadie
Integrates Sebastian Aaltonen's ALU optimisations for TAAU.
Change 4003112 by Olaf.Piesche
Fix for TBB stall (resulting in severe hitches and hangs in the editor with stats active); tested multiple scenarios and encountered no hitches.
#tests QAGame PerformanceTest and RenderTest map with various stats on and off
Change 4003159 by Mark.Satterthwaite
Undo parts of changelist 3970553 - the ref-counted pointer approach to returning textures to the pool is not working as expected so we'll remove that. It'll be faster on the CPU without it and everything works thanks to the changes this CL made to the way textures were released.
#jira UE-57538
Change 4003287 by zachary.wilson
Adding reflection capture content to TM-LightingScenarios
Change 4003395 by Arne.Schober
DR - Fix uninitialised value when clicking Go To in the editor
#jira UE-57048
Change 4003425 by Rolando.Caloca
DR - vk - Fix for new occlusion queries
Change 4003530 by Arne.Schober
DR - Disable GPU Benchmark in headless configurations
#jira UE-57673
Change 4003717 by Rolando.Caloca
DR - vk - Fix for depth not store, stencil store
Change 4003719 by Rolando.Caloca
DR - Minor switch to render pass
Change 4003720 by Mark.Satterthwaite
Don't suballocate private memory buffers on Vega and only Vega as there is something wrong with the blits in those cases but I can't capture a GPU trace to find out what right now (the driver is broken) - could be a bug in my code but this works on Polaris and Nvidia so it will need to be filed as a radar for AMD.
Remove the FMetalBufferChunk from FMetalBuffer and simply store a pointer to the owning Heap/Magazine allocator. The FMetalResourceHeap now calls a new Release function to return the buffer to the allocator which will be faster on the CPU.
#jira UE-57659
Change 4003854 by Mark.Satterthwaite
Undo parts of 3990064 and try a different approach to get the uniforms to upload and remain available in the right places. As the original bug has been lost to time we should keep an eye out for missing buffer bindings by running under the Metal validation layer periodically.
#jira UE-57576
Change 4004709 by Rolando.Caloca
DR - Support for D3D 11, 12 & Vulkan for UAVs off Index Buffers
Change 4005149 by Guillaume.Abadie
Adds shader permutation to avoid clamping input buffer UV in DOF's gather pass.
Change 4005284 by Uriel.Doyon
Resaved volume texture assets with proper engine version.
#jira UE-57534
Change 4005286 by Guillaume.Abadie
Reduces constant setup in DOF's gather pass.
Change 4005359 by Rolando.Caloca
DR - vk - Fix annoying warning
Change 4005363 by Rolando.Caloca
DR - Fix android not finding vulkan shaders
Change 4005457 by Rolando.Caloca
DR - vk - Fix swapchain crash
Change 4005473 by Patrick.Kelly
UE-57135: Editor crash if set Reflection Capture Resolution to be 64 and New a Default level
Code by Daniel
Tested by Patrick
Change 4005474 by Rolando.Caloca
DR - vk - Remove glsl code from shaders. Packaged QAGame goes from 176MB to 162MB
Change 4005759 by Krzysztof.Narkowicz
Fixed a bug, where reflection capture build is called, even though we are in mobile preview mode.
#jira UE-57743
Change 4005774 by Mark.Satterthwaite
Update the wave intrinsics to avoid implicit bool->uint conversion that Apple don't like.
#jira UE-57750
Change 4005974 by Mark.Satterthwaite
Don't use cubemap array types on iOS Metal as they aren't available on all devices and we need to maintain backward compatibility for years to come.
#jira UE-57083
Change 4006056 by Mark.Satterthwaite
Remove the use of the PrimitiveType argument from Metal draw calls.
#jira UE-57822
Change 4006139 by Mark.Satterthwaite
- Move the render-pass functions into the MetalRHI implementation for later alteration.
- Implement Index buffer UAVs for Metal - makes them more like vertex-buffers so this is one more step on the road to a unified buffer base-class implementation.
Change 4006215 by Mark.Satterthwaite
Metal's begin & end render/compute pass API implementation will take some time, but for now make it not depend on the parent stub implementation.
Change 4006394 by Mark.Satterthwaite
In lieu of a real instruction count just use the number of lines in the "Main" function of the shader as the instruction count for Metal.
#jira UE-57551
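The line-count heuristic described above could look roughly like the sketch below: locate the first "Main", find its brace-delimited body, and count the non-blank lines inside. The function name and the parsing shortcuts (no handling of comments, strings, or other identifiers containing "Main") are assumptions made for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: approximates an instruction count by counting the
// non-blank lines between the braces of the first function named "Main".
inline int CountMainBodyLines(const std::string& Source)
{
    const std::size_t NamePos = Source.find("Main");
    if (NamePos == std::string::npos) return 0;

    const std::size_t Open = Source.find('{', NamePos);
    if (Open == std::string::npos) return 0;

    // Walk forward to the brace that closes the function body.
    std::size_t Close = std::string::npos;
    int Depth = 0;
    for (std::size_t Index = Open; Index < Source.size(); ++Index)
    {
        const char C = Source[Index];
        if (C == '{')
        {
            ++Depth;
        }
        else if (C == '}' && --Depth == 0)
        {
            Close = Index;
            break;
        }
    }
    if (Close == std::string::npos) return 0;

    // Count lines in the body that contain something other than whitespace.
    std::istringstream Body(Source.substr(Open + 1, Close - Open - 1));
    std::string Line;
    int Count = 0;
    while (std::getline(Body, Line))
    {
        if (Line.find_first_not_of(" \t\r") != std::string::npos)
        {
            ++Count;
        }
    }
    return Count;
}
```

A count like this is only a proxy: it tracks source length, not what the compiler emits, which matches the "in lieu of a real instruction count" wording of the change.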
Change 4006493 by Mark.Satterthwaite
MetalRHI can currently support 4-component formats for Buffer UAVs - this might need some thought in the future as the API evolves but we might as well take advantage while we can.
Change 4006495 by Daniel.Wright
Integrate from Refactor branch
* New FMaterialRenderProxy function GetMaterialWithFallback which provides both the FMaterialRenderProxy and FMaterial. Needed when falling back to default material, so that proxy and material resource match.
* Local vertex factory uniform buffer
Change 4006851 by Brian.Karis
Fix for joined charts forming an L to inflate both axes.
Thanks to Jess Kube of The Coalition.
Change 4006852 by Brian.Karis
Fix for hard coded reflection capture cube map size. Should fix static light aliasing in captures
Change 4006918 by Brian.Karis
New ByteBuffer functionality. Memcpy and scatter upload. Can implement GPU side TArray reflection.
Not yet used by checked in code. WIP optimization.
Change 4007246 by Guillaume.Abadie
Creates lower quality permutation for DOF's gathering pass, without Coc based weighting of the samples, and lower number of gathering ring for fast accumulator.
Change 4007291 by Guillaume.Abadie
Exposes more DOF scalability settings.
Change 4007328 by Guillaume.Abadie
Optimises DOF's half res only setup pass using gather4
Change 4007627 by Richard.Wallis
Fix for when Magic Mouse cannot zoom in World Composition editor. Missing default SNodePanel::OnMouseMove behaviour. Tested using a classic 2xbutton + wheel mouse and a Mac MagicMouse.
#jira UE-57030
Change 4007682 by Richard.Wallis
No video when playing HLS streaming video on Mac. Two issues: FPS was zero, making the duration for the video sample buffer nonsense, and Video Track dimensions were going to zero on the AVAsset once fully initialized when playing HLS streams. Now cache relevant details and handle zero frame rate.
Notes:
- Caching the frame rate is not as important as we could look it up each time and fix for zero - ignoring that at the moment.
- Assume we DO NOT want the FrameSize to be the last fetched video frame size from the AvfMediaVideoSampler as I think that is the video quality for streaming video and not the media frame size.
- Renamed a variable in the AvfMediaVideoSample - was called FrameRate but it was the FrameDuration by that point.
#jira UE-56734
Change 4007731 by Rolando.Caloca
DR - Disable byte buffers on non-hlsl based platforms
#jira UE-57851
Change 4007741 by Rolando.Caloca
DR - Disable byte buffers on hlslcc platforms
Change 4007782 by Mark.Satterthwaite
Force Metal shaders, including the stdlib, to recompile.
Change 4007918 by Rolando.Caloca
DR - vk - Some static asserts
Change 4008404 by Arciel.Rekman
Do not crash on incompatible Vulkan drivers (UE-57521).
#jira UE-57521
Change 4008442 by Daniel.Wright
Better comments on ERHIFeatureLevel expectations
Change 4008494 by Arne.Schober
DR - moved bDeletedThroughDeferredCleanup before begincleanup to catch cases where the reference is added twice to the array. Also removed finishcleanup as all they ever did was delete the pointer anyway, and it should be added if such functionality is ever required from outside of the regular destructor.
#jira UE-57754
Change 4008730 by Mark.Satterthwaite
After the most recent changes to handling uniform buffer dirty bits in MetalRHI we should guard against attempts to set an unbound uniform buffer.
#jira UE-57870
Change 4008949 by Brian.Karis
Fix compile warning
Change 4008951 by Brian.Karis
Added LTC LUT textures
Change 4009326 by Guillaume.Abadie
Compiles out DOF's gathering bokeh simulation on platforms other than desktop.
Change 4009380 by Krzysztof.Narkowicz
Moved area light code before the contact shadows, so contact shadows use representative light's direction.
Merged all contact shadows shader code.
Contact shadows keep constant screen space length independent of FoV settings.
Contact shadows for translucents.
Contact shadows for eye.
Change 4009555 by Guillaume.Abadie
Splits DOFCocTile.usf in two.
Change 4009999 by Yuriy.ODonnell
MallocStomp can now be enabled on certain platforms using '-stompmalloc' command line argument.
Previously it was necessary to modify MallocStomp.h and re-compile the engine.
Currently supported platforms: Win64, Mac, Linux.
Replaced hard-coded page size with FPlatformMemory::GetConstants().PageSize.
Change 4010288 by Rolando.Caloca
DR - vk - Fix for vertex streams
Change 4010289 by Krzysztof.Narkowicz
D3D12 - fixed depth bounds bug, where depth bounds wasn't properly set to [0;1] after disabling.
#jira UE-57510
Change 4010297 by Rolando.Caloca
DR - vk - Remove some functions for android
Change 4010315 by Rolando.Caloca
DR - vk - Remove create info macro
Change 4010451 by Rolando.Caloca
DR - vk - Reuse samplers
- Infiltrator goes from 5759 to 24 samplers!
Change 4010627 by Rolando.Caloca
DR - vk - Fix missing values for tracking swapchain validation
Change 4011924 by Guillaume.Abadie
Implements tile based early return optimisation on DOF's postfiltering method.
Change 4011941 by Guillaume.Abadie
Shaves some ALU in DOF's accumulator for LowQuality permutation.
Change 4012093 by Yuriy.ODonnell
Disable MallocStompOverrunTest() in static analysis config, as it intentionally performs an out-of-bounds access.
Change 4012195 by Rolando.Caloca
DR - vk - Fix for mobile backbuffer layout
Change 4012202 by Rolando.Caloca
DR - vk - Don't use staging buffers on UMA
Change 4012467 by Rolando.Caloca
DR - Remove redundant check
Change 4012486 by Rolando.Caloca
DR - Fix missing transition
Change 4012518 by Guillaume.Abadie
Implements fast shader permutation for DOF's TAA pass.
Change 4013084 by Arciel.Rekman
Fix for Linux clock discrepancy.
- Causing at least one precision issue, possibly more.
(Edigrating 4003273, 4012462 from //UE4/Dev-Editor/... to //UE4/Dev-Rendering/...)
Change 4013266 by Uriel.Doyon
Fixed crash when setting SceneDepthTextureNonMS and not having valid depth buffers in the SceneContext.
Change 4013626 by Uriel.Doyon
Fixed crash in the lighting build when creating a blueprint of the ALight and placing a light component in it.
#jira UE-51672
Change 4013805 by Rolando.Caloca
DR - Fix more missing transitions
Change 4014128 by Arne.Schober
DR - Do not create LocalVFUniformBuffer when running without MVF
#jira UE-57929
Change 4014193 by Uriel.Doyon
Editing component transforms now invalidate the component's lighting cache.
#jira UE-48134
Change 4014282 by Rolando.Caloca
DR - vk - Remove extra validation during dump
Change 4014584 by Uriel.Doyon
Duplicated static meshes now generate a new GUID to prevent possible issues with lightmass.
#jira UE-49064
Change 4014604 by Uriel.Doyon
UStaticMesh postduplicate now only generates a new GUID if !bDuplicateForPIE.
Change 4015460 by Guillaume.Abadie
Composes separate translucency within DOF's recombine pass.
Change 4015571 by Guillaume.Abadie
Refactors tonemapper to use the global shader permutation API, which adds a permutation for HDR output device rather than dynamic branching that some shader compilers do not optimize very well.
Change 4015984 by Krzysztof.Narkowicz
Fixed crash inside DFAO resource allocation, when DFAO viewport has zero area.
#jira UE-58000
Change 4016056 by Mark.Satterthwaite
Fix Mac Metal shader compilation of texture cube arrays.
Change 4016062 by Richard.Wallis
Convert things like Space, Delete, F6 etc to unicode so they display correctly on the Mac menu rather than first letter of word. Added the default Mac commands to the GenericCommands so we get a Chord overwrite message and stop things like cmd+ q / w / h from getting bound.
#jira UE-46999
Change 4016109 by Mark.Satterthwaite
One unified Metal buffer implementation - will make further changes a heck of a lot easier.
Change 4016221 by Patrick.Kelly
UE-57617: Ensure changing viewmode to ShaderComplexity while in -game
Change 4016238 by Guillaume.Abadie
Makes clang happy again in Tonemapper.
Change 4016309 by Mark.Satterthwaite
More *_RenderThread implementations for MetalRHI.
Change 4016414 by Mark.Satterthwaite
And MetalRHI version of CreateStructuredBuffer_RenderThread...
Change 4016498 by Mark.Satterthwaite
Don't hold on to the uniform buffers bound to the hull shader when switching to a tessellated draw call as they'll have the wrong buffer layout.
#jira UE-57930
Change 4017394 by Juan.Canada
OpenGL: Fixed shading artifacts due to incorrect UNORM/SNORM conversions in skin/skincache/computetangent shaders.
#jira UE-57691
Change 4017522 by Rolando.Caloca
DR - vk - Remove unused code path (old mip generation detection)
Change 4017539 by Rolando.Caloca
DR - vk - Fix for sky lighting mips showing green on AMD
Change 4017542 by Arciel.Rekman
Moved appCountTrailingZeros to a non-SSE header (fixes ARM64 build).
- Arguably WITH_SLI shouldn't apply to Linux on ARM but the fact that the function wasn't available is bad on its own.
Change 4017827 by Guillaume.Abadie
Optimises DOF's scattering cost by a third.
Change 4017835 by Rolando.Caloca
DR - Only allow a render pass to generate mips for one color render target
Change 4017889 by Mark.Satterthwaite
Cache all the Metal state objects to avoid hitting the API unnecessarily.
Change 4018251 by Mark.Satterthwaite
Fix broken rendering on Metal that tracked back to the innocuous looking changes in CL #4006495 (no blame attached - these changes are entirely reasonable) and cause various bugs in QAGame's TM-DistanceFields, ElementalDemo and probably more. Doesn't fix broken SpeedTree rendering :(.
MetalRHI was allowing uniform buffers to blow away linear texture buffers when the constant buffer has been elided due to dead-code elimination. This problem can manifest without linear textures if the uniform buffer contains both constant data and a resource-table but the shader doesn't use any of the constant data. That's because Metal doesn't separate constant buffers from any other kind of buffer unlike D3D which separates all the slots out - and Metal doesn't provide enough buffers to emulate the D3D arrangement. So far this has only manifested in the MVF + Linear Texture case but a more robust solution will be necessary long term.
Change 4018514 by Guillaume.Abadie
Implements r.DOF.Scatter.MinCocRadius.
Change 4018553 by Guillaume.Abadie
Implements r.DOF.Scatter.MaxSpriteRatio to control the budget upperbound of DOF's scattering
Change 4020369 by Yuriy.ODonnell
Disable MallocStompOverrunTest in all static analysis configs (using USING_CODE_ANALYSIS macro)
Previously was only disabled for PVS-Studio.
Change 4020620 by Arciel.Rekman
Fix XboxOne CIS (fallout of appCountTrailingZeros move).
Change 4020949 by Guillaume.Abadie
Configures DOF in scalability settings.
Change 4021593 by Rolando.Caloca
DR - vk - Support for Aftermath style api on AMD
Change 4021740 by Rolando.Caloca
DR - vk - Change log output
Change 4022008 by Uriel.Doyon
Fixed renderthread stalls when streaming texture mips on low end systems.
Change 4022135 by Rolando.Caloca
DR - vk - Fix last mip's layout during mip chain creation
Change 4022607 by Jian.Ru
Speculative fix for a bug where an invalid vertex buffer is dereferenced
#jira UE-56229
Change 4022890 by Rolando.Caloca
DR - Fix reference count not getting released
Change 4023540 by Mark.Satterthwaite
Avoid some pointless retain/release calls on Metal Encoders.
Change 4023796 by Marcus.Wassmer
Tell users they are over the maximum size when allocating very large rendertargets.
Change 4025337 by Yuriy.ODonnell
Improved use-after-free detection mechanism and physical memory usage of MallocStomp on Windows.
MallocStomp on Windows will now reserve virtual address space for every allocation and then commit physical pages only to the valid usable part.
Physical pages will be unmapped on Free, but virtual address space will not be released and therefore will never be re-used.
Virtual address space is allocated from the OS in blocks of 1GB and then linearly sub-allocated.
This reduces VA space usage, as VirtualAlloc returns blocks on 64KB granularity even if we just need 4KB. As a small bonus, this also reduces number of syscalls per allocation.
This dramatically increases accuracy of use-after-free detection, but consumes significant amount of memory for the OS page table.
Virtual memory limit for a process on Win10 is 128 TB, which means we can afford to keep virtual memory reserved for a long time.
Running Infiltrator demo consumes ~700MB of virtual address space per second.
Additionally, committing physical pages only for the usable part of the entire virtual block reduces physical memory usage by ~30% compared to old behavior,
which allocated and committed entire block of pages via BinnedAllocFromOS and then marks border page as non-accessible.
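A minimal sketch of the address arithmetic described above, not the MallocStomp implementation: it only simulates addresses and makes no OS calls. The names (LinearVaAllocator, StompLayout), the fixed 4KB page size, the missing alignment handling and the choice to end the payload flush against the guard page are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants; the engine queries the real page size at runtime.
constexpr std::uint64_t kPageSize  = 4096;
constexpr std::uint64_t kBlockSize = 1ull << 30; // 1GB of reserved address space per block

struct StompLayout
{
    std::uint64_t ReservedBase;   // start of the address range owned by this allocation
    std::uint64_t CommittedBytes; // bytes that would be backed by physical pages
    std::uint64_t GuardPage;      // page after the payload that is never committed
    std::uint64_t UserPtr;        // address handed to the caller (alignment ignored here)
};

class LinearVaAllocator
{
public:
    StompLayout Allocate(std::uint64_t Size)
    {
        assert(Size > 0);
        // Only whole pages covering the payload are committed.
        const std::uint64_t Committed = (Size + kPageSize - 1) / kPageSize * kPageSize;
        const std::uint64_t Total = Committed + kPageSize; // payload pages + guard page
        assert(Total <= kBlockSize);
        if (Offset + Total > kBlockSize)
        {
            // Current block exhausted: move to a fresh block. Old addresses are never reused.
            ++BlockIndex;
            Offset = 0;
        }
        const std::uint64_t Base = BlockIndex * kBlockSize + Offset;
        Offset += Total; // linear bump, nothing is ever handed back
        return { Base, Committed, Base + Committed, Base + Committed - Size };
    }

private:
    std::uint64_t BlockIndex = 1; // start at 1 so address 0 is never produced
    std::uint64_t Offset = 0;
};
```

Because the allocator only ever bumps forward, a freed range keeps its addresses forever, which is what lets a stale pointer fault instead of silently hitting a newer allocation.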
Change 4026047 by Rolando.Caloca
DR - Fix test/shipping
#jira UE-58148
Change 4026150 by Krzysztof.Narkowicz
Force proper ordering of buffer visualization materials - after tonemapping (so exposure doesn't influence it) and before editor stuff like icons.
#jira UE-57992
Change 4026226 by Rolando.Caloca
DR - Fix static analysis
#jira UE-58150
Change 4026354 by Jian.Ru
Debug check trying to catch a crash. Only enabled in editor build
#jira UE-50111
Change 4026655 by Rolando.Caloca
DR - Fix for static analysis
#jira UE-58149
Change 4026763 by Rolando.Caloca
DR - Remove references to defunct CCT to avoid confusing licensees
Change 4027167 by Uriel.Doyon
Fixed possible out of bound buffer access when serializing with FDuplicateDataWriter.
#jira UE-56509
Change 4027850 by Jian.Ru
Prevent log spam
#jira UE-50111
Change 4029546 by Rolando.Caloca
DR - Compile fixes
Change 4029624 by Yuriy.ODonnell
Addressed static analysis errors in MallocStomp
- VirtualAlloc return value is now explicitly checked.
- C6250 is suppressed, as VirtualFree does not release address space by design.
Change 4030225 by Yuriy.ODonnell
Static analysis warning fix: make sure declaration of Sleep() is consistent between Windows headers and TBB
The complexity with this particular case is that the warning is generated in synchapi.h, which is included by some Windows headers.
If a module includes TBB and then Windows platform headers, static analyzer will report this warning.
Suppressing it would require wrapping all instances of Windows header includes in third-party macros.
Current pragmatic solution is to modify the Sleep() declaration in TBB header to be consistent with Windows and to report the issue to Intel for a permanent fix.
Change 4030440 by Rolando.Caloca
DR - Fix crash on mobile
#jira UE-58222
Change 4030570 by Daniel.Wright
Allow null SRV's in uniform buffers for feature levels that don't support SRV's in shaders
Change 4030618 by Arne.Schober
DR - missing tangent/normal sign conversion after integration from main
#jira UE-58224
Change 4031588 by Rolando.Caloca
DR - vk - Fix compile error when missing vkCmdWriteBufferMarkerAMD
Change 4032145 by Mark.Satterthwaite
Fix UE-58268 by only emitting the base_instance/base_vertex variables required to fix-up the instance/vertex ID values to match D3D when the Metal version is 1.1 or higher, earlier versions don't support these features.
#jira UE-58268
Change 4032209 by Rolando.Caloca
DR - Fix crash on EngineTest: Mesh Batch's UserIndex is not a union anymore
Change 4033178 by Guillaume.Abadie
Fixes FXAA sampling outside viewports, that was causing black outline on bottom and right edge of the screen when ViewSize != BufferSize, problematic for some screenshot automated test.
#jira UE-58151
Change 4034489 by Daniel.Wright
Fixed UStaticMeshComponent modifying its UStaticMesh when undoing a change. This caused a crash when other static mesh components using the same mesh asset were rendered, since their rendering state was not recreated. A component should not modify its asset during PostEditUndo.
* This behavior has been present for a long time but was previously hidden because only the vertex factory of the mesh asset is cached in static draw lists, not any of its rendering resources (eg vertex declaration).
Change 4035157 by Uriel.Doyon
Fixed deadlock in the streaming code when running with -onethread.
#jira UE-58299
Change 4035198 by Rolando.Caloca
DR - vk - Fix issue when an older SDK was installed, UBT would pick it (should pick the newer of ThirdParty\Vulkan or installed SDK).
#jira UE-58267
Change 4035730 by Arne.Schober
DR - Fix missing Fog parameters during LightScattering Injection
#jira UE-57608
Change 4035843 by Daniel.Wright
Reimplemented support for EyeAdaptation node in opaque materials
Change 4036837 by Marcus.Wassmer
Replace some of the screenshots to match new un-tonemapped buffer visualization
Change 4036980 by Rolando.Caloca
DR - vk - Fix deadlock contention during mem allocation on Linux
Change 4037225 by Guillaume.Abadie
Fixes jittering selection outline.
#jira UE-58350
Change 4038056 by Marcus.Wassmer
roll back changelist 4026150. breaks a bunch of automated tests by cutting off half the image.
Change can go back in later with that part fixed also
Change 4038296 by Jian.Ru
Static analysis fix
#jira UE-58377
Change 4038402 by Ben.Marsh
Suppress IncludeTool warnings caused by CL 3998947.
Change 4038514 by Arne.Schober
DR - Fix case with MVF where instance offset is not supported by the API (in this case only foliage on OpenGL and tvOS). Usually the buffers are offset instead, but with MVF we do not use offset buffers, therefore the offset needs to be passed into the shader although we are drawing with an offset of 0.
#jira UE-57652
Change 4038747 by Marcus.Wassmer
Back out changelist 3853645, causing us to lose shadows in the shaderhair test
Change 4040138 by Rolando.Caloca
DR - Fix compile warning
Change 4041614 by Rolando.Caloca
DR - vk - Fix for Oculus module
#jira UE-58267
Change 3810277 by Daniel.Wright
Ray Traced Distance Field shadows use a two pass tile culling algorithm with no tile max - fixes flickering from tile overflow in dense areas or with a low sun angle. Costs .2ms on PS4.
The distance field scene buffers now use float4 on PS4 and Xbox, saves .1ms on PS4.
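One common way to avoid a fixed per-tile capacity is a count pass followed by a scatter pass; the CPU sketch below shows that pattern. Whether the shadow culling shaders use exactly this scheme is an assumption, and the 1D Interval footprint and BuildTileLists name are hypothetical stand-ins for the real 2D tile data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive range of tiles an object touches (1D stand-in for a 2D footprint).
struct Interval { int MinTile; int MaxTile; };

struct TileLists
{
    std::vector<int> Start;   // objects of tile t live in Objects[Start[t] .. Start[t+1])
    std::vector<int> Objects; // compact list of object indices, grouped by tile
};

TileLists BuildTileLists(const std::vector<Interval>& Objs, int NumTiles)
{
    assert(NumTiles > 0);
    TileLists Out;
    Out.Start.assign(static_cast<std::size_t>(NumTiles) + 1, 0);

    // Pass 1: count how many objects land in each tile.
    for (const Interval& O : Objs)
    {
        assert(O.MinTile >= 0 && O.MaxTile < NumTiles);
        for (int T = O.MinTile; T <= O.MaxTile; ++T)
        {
            ++Out.Start[T + 1];
        }
    }

    // Prefix sum turns the counts into start offsets, so every tile gets exactly the space it needs.
    for (int T = 0; T < NumTiles; ++T)
    {
        Out.Start[T + 1] += Out.Start[T];
    }

    // Pass 2: write each object index into its tile's range.
    Out.Objects.resize(static_cast<std::size_t>(Out.Start[NumTiles]));
    std::vector<int> Cursor(Out.Start.begin(), Out.Start.end() - 1);
    for (int I = 0; I < static_cast<int>(Objs.size()); ++I)
    {
        for (int T = Objs[I].MinTile; T <= Objs[I].MaxTile; ++T)
        {
            Out.Objects[Cursor[T]++] = I;
        }
    }
    return Out;
}
```

Since storage is sized from the counts, a dense tile cannot overflow and drop objects, which is the failure mode the change description attributes to the old tile max.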
Change 3817029 by Uriel.Doyon
Added UVolumeTexture, which uses 3D textures. Compressed formats are supported on DX11, DX12, PS4 and XB1.
Projects targeting OpenGL don't have access to compressed formats (as the implementation has texture tiling issues).
Add "r.AllowVolumeTextureAssetCreation" set as 0 by default, which controls whether volume texture can be sampled in materials and whether they can be created from 2D texture assets.
Platforms not supporting BC7 will now fall back on RGBA8 instead of DXT to preserve quality, in an attempt to increase usage of BC7.
#jira UE-32263
Change 3819960 by Michael.Lentine
Expose UEPhysics Clothing Parameters through UI.
Change 3823401 by Rolando.Caloca
DR - Add NumQueriesInBatch to RHIBeginOcclusionQueryBatch
Change 3844805 by Arne.Schober
DR - Increased Intermediate normal of Umodel and Skelmesh from 8bit Unorm Compressed to float. A resave/rebuild/reimport of the meshes is recommended to recover some lost precision.
Fixed an issue with compressed (packed) normals on the GPU which were off by one integer representation. Also switched from UNORM to SNORM to get a discrete zero representation and removed some mads from all the VertexShaders.
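To illustrate why SNORM gives a discrete zero, here is a small sketch of 8-bit encode/decode for both conventions. The function names are hypothetical and the formulas are the standard UNORM/SNORM definitions, not code taken from the engine's vertex factories; inputs are assumed to already lie in [-1, 1].

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// UNORM8 storing a [-1, 1] value: decode needs a scale and bias (x * 2 - 1).
inline std::uint8_t EncodeUnorm8(float X)
{
    return static_cast<std::uint8_t>(std::lround((X * 0.5f + 0.5f) * 255.0f));
}
inline float DecodeUnorm8(std::uint8_t V)
{
    return V / 255.0f * 2.0f - 1.0f;
}

// SNORM8: the stored integer is signed, so decode is a single scale.
inline std::int8_t EncodeSnorm8(float X)
{
    return static_cast<std::int8_t>(std::lround(X * 127.0f));
}
inline float DecodeSnorm8(std::int8_t V)
{
    const float F = V / 127.0f;
    return F < -1.0f ? -1.0f : F; // -128 clamps to -1 per the usual SNORM rule
}
```

With UNORM, zero would need the byte value 127.5, so a zero component always decodes slightly off; with SNORM the integer 0 decodes to exactly 0.0, and the scale-and-bias step disappears from the decode.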
Change 3847283 by Marcus.Wassmer
Extra fixes from Uriel
Change 3876607 by Rolando.Caloca
DR - Use render passes when running occlusion queries
- Removes the RHI(Begin|End)OcclusionQueryBatch API
Change 3903799 by Daniel.Wright
[Integrate] Pass Uniform Buffers
* All pass-constant shader inputs should go into the appropriate pass uniform buffer, instead of being set per-draw
* Moved many per-draw base pass parameters over to the Base Pass Uniform Buffer
* Opaque and Translucent base pass shaders have different uniform buffers, which allows compile errors when accessing an invalid resource (eg GBuffer in Opaque), instead of silently falling back to GBlackTexture
Uniform buffers can now contain nested structs with UNIFORM_MEMBER_STRUCT()
* This allows composing a uniform buffer at a particular update frequency out of many features, with encapsulation of each feature's parameters in a struct.
* Eg deferred fog uses FFogUniformParameters, but so does translucency in the base pass, where FFogUniformParameters is reused nested inside the base pass uniform buffer.
* Resources can now be located anywhere in the uniform buffer. Padding is inserted to the cbuffer representation to keep memory layouts matching. In the future the cbuffer could be compacted.
* RemoveUniformBuffersFromSource() which works around HLSLCC lack of struct initializers now handles nested structs
Change 3917500 by Rolando.Caloca
DR - Change depth bounds so only the enable bit is in the PSO, allow min/max to be dynamically modified
Change 3964907 by Guillaume.Abadie
Implements RectList topology support in RHI.
Change 3979171 by Mark.Satterthwaite
Copying //Tasks/UE4/Dev-UERNDR-354-mtlpp to Dev-Rendering (//UE4/Dev-Rendering):
Rewrites MetalRHI in terms of mtlpp, which is a C++ wrapper library built around Metal's Objective-C API that attempts to reduce overheads and eliminate resource lifetime errors.
Regarding mtlpp:
- The mtlpp library uses C++ constructor/destructor and smart-pointer style management of Objective-C retain/release calls to prevent over- and under-release problems.
- To reduce Objective-C overheads the mtlpp library caches the internal C-function that implements the Objective-C selectors for the most commonly used Metal protocol types and calls the function directly - this avoids objc_msgSend which does this look-up dynamically and thus improves CPU performance slightly.
- Another advantage is that mtlpp provides infrastructure to extend the Metal API slightly to help improve MetalRHI - the two important aspects are mtlpp::CommandBufferFence which provides a consistent CPU<->GPU synchronisation primitive and sub-buffer allocations from mtlpp::Buffer which allow for far superior memory management.
- Validation functionality is also provided by mtlpp to detect CPU vs. GPU data races and resource lifetime validation - this is expensive and is thus optional and compiled out from Shipping binaries that should be used when performance is most critical. The validation only works between resource modification and *submitted* command-buffers - anything that is being actively encoded on the CPU is ignored and it remains the responsibility of the application to validate the order of operations when encoding.
Apple Platform:
- LLM support which tracks Objective-C objects is enabled only on macOS - we don't have the necessary libraries to intercept and override the internal system calls on iOS.
MetalRHI:
- All the types are switched over, (mostly) insulating the external API from the horror of Metal and Objective-C.
- Buffers are now managed quite differently, small buffers are allocated from a magazine allocator that allocates in fixed blocks from a larger parent buffer, intermediate sized buffers are allocated from a simple heap allocator that wraps a larger buffer and anything of reasonable size (>2Mb) will use the pooled allocator. This *radically* reduces the number of buffer resources, by as much as a factor of 10, because they are now sub-allocated without the need to use MTLHeap or MTLFence so they are performance equivalent to the existing implementation on the GPU and much faster on the CPU. Total memory use is approximately the same.
- Vertex & index buffer management has been updated to reflect changes in the management and to avoid reallocating buffers which provide a Linear Texture (for SRVs) unless strictly necessary. This ensures that even in cases where a dynamic buffer is updated multiple times in a frame it will still work acceptably well.
- The Metal ring-buffer implementation is completely different again, this time it can use Managed memory on macOS which allows for much better performance on eGPUs which will be more and more important for Mac.
- Everything that needs to wait on a command-buffer fence (rather than a command-buffer itself) now uses mtlpp::CommandBufferFence, which prevents race conditions between the different command-buffer handlers (which sometimes execute out of order).
- LLM tracking should now report the same data as the MetalRHI stats group for buffer & texture allocations - there is no segmentation for Vertex/index/Structured/Uniform allocations in Metal so these numbers are going to be wrong and will need to be rethought.
- What will be unseen are the number of small but important resource usage fixes that avoid stale resources from being bound to the device after the point at which they become invalid. This should eliminate a class of errors where the GPU uses a resource pointer that is modified by the CPU and was necessary to satisfy the new mtlpp validation code.
Other:
- Remove the Metal focused workarounds from the ClothBuffer resource binding and related vertex-buffer SRV - these were put in when MetalRHI/MetalShaderFormat couldn't handle float->uint conversions correctly and they should now.
- Fix a validation error caused by trying to render a 0-sized scissor rect which is invalid in Metal and simply pointless elsewhere.
- Consistency of disabling the Manual Vertex Fetch behaviour in shaders.
#jira UERNDR-354
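The three buffer allocation tiers described in the MetalRHI notes above can be sketched as a size-based dispatch. Only the 2Mb pooled threshold comes from the description; the magazine cut-off, the enum and the function name are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

enum class EBufferAllocator { Magazine, Heap, Pooled };

// Hypothetical upper bound for "small" buffers; the description gives no number.
constexpr std::size_t kMagazineMaxBytes = 4 * 1024;
// "anything of reasonable size (>2Mb) will use the pooled allocator"
constexpr std::size_t kPooledMinBytes = 2 * 1024 * 1024;

inline EBufferAllocator ChooseAllocator(std::size_t SizeInBytes)
{
    assert(SizeInBytes > 0);
    if (SizeInBytes <= kMagazineMaxBytes)
    {
        return EBufferAllocator::Magazine; // fixed-size blocks carved from a parent buffer
    }
    if (SizeInBytes <= kPooledMinBytes)
    {
        return EBufferAllocator::Heap;     // simple heap wrapping a larger buffer
    }
    return EBufferAllocator::Pooled;       // dedicated buffer from the pool
}
```

Routing small and intermediate requests into sub-allocations of a few large parent buffers is what cuts the number of distinct buffer resources.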
Change 3979312 by Rolando.Caloca
DR - Remove bogus bKeepOriginalSurface parameter in CopyToResolveTarget
Change 4005122 by Rolando.Caloca
DR - Support for PS4 Index Buffer UAVs
Change 4016298 by Guillaume.Abadie
Fixes DOF hybrid scattering on platforms that support RectList topology.
Change 4018575 by Guillaume.Abadie
Optimises DOF's reduce pass when doing scattering compilation.
Change 4020317 by Guillaume.Abadie
Implements WaveBroadcastIntrinsics.ush.
[CL 4042226 by Marcus Wassmer in Main branch]
2018-05-01 10:36:33 -04:00
ViewUniformShaderParameters . GlobalVolumeDimension = 0.0f ;
ViewUniformShaderParameters . GlobalVolumeTexelSize = 0.0f ;
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters . MaxGlobalDFAOConeDistance = 0.0f ;
2020-07-06 18:58:26 -04:00
ViewUniformShaderParameters . NumGlobalSDFClipmaps = 0 ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
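A sketch of how the three r.sss.checkerboard values could map to a decision, assuming the automatic mode falls back to checkerboard whenever the scene color target cannot hold full-res lighting. SceneColorDesc and UseCheckerboard are hypothetical names, not the engine's.

```cpp
#include <cassert>

// Mirrors the three documented values of r.sss.checkerboard.
enum class ECheckerboardMode { Off = 0, On = 1, Auto = 2 };

// Hypothetical description of the scene color render target.
struct SceneColorDesc
{
    int  BitsPerPixel;
    bool bSeparateAlpha;
};

inline bool UseCheckerboard(int CVarValue, const SceneColorDesc& SceneColor)
{
    assert(CVarValue >= 0 && CVarValue <= 2);
    switch (static_cast<ECheckerboardMode>(CVarValue))
    {
    case ECheckerboardMode::Off:
        return false;
    case ECheckerboardMode::On:
        return true;
    case ECheckerboardMode::Auto:
    default:
        // Full res needs a 64-bit or wider target with separate alpha.
        return !(SceneColor.BitsPerPixel >= 64 && SceneColor.bSeparateAlpha);
    }
}
```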
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Runs in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
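As a rough sketch of the pool size limit, assuming the cap is simply a percentage of available VRAM: EffectivePoolSize and its parameters are hypothetical, and the real streamer may round or clamp differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns the pool size to use, optionally capped to a percentage of VRAM.
// VramPercent plays the role of GPoolSizeVRAMPercentage in the description.
inline std::int64_t EffectivePoolSize(std::int64_t PoolSize,
                                      std::int64_t VramBytes,
                                      int VramPercent,
                                      bool bLimitToVram)
{
    assert(PoolSize >= 0 && VramBytes >= 0);
    assert(VramPercent >= 0 && VramPercent <= 100);
    if (!bLimitToVram)
    {
        return PoolSize; // setting disabled: keep the requested size
    }
    // Divide first to stay clear of overflow on very large VRAM values.
    const std::int64_t Cap = VramBytes / 100 * VramPercent;
    return std::min(PoolSize, Cap);
}
```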
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldPageAtlasTexture = OrBlack3DIfNull(GBlackVolumeTexture->TextureRHI.GetReference());
2022-03-01 21:07:45 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldCoverageAtlasTexture = OrBlack3DIfNull(GBlackVolumeTexture->TextureRHI.GetReference());
2022-01-26 08:27:36 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldPageTableTexture = OrBlack3DUintIfNull(GBlackUintVolumeTexture->TextureRHI.GetReference());
2020-09-15 11:03:59 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldMipTexture = OrBlack3DIfNull(GBlackVolumeTexture->TextureRHI.GetReference());
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
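The three r.sss.checkerboard values described above can be read as a small decision function. The following is a minimal sketch only, not the engine's code: `ECheckerboardMode`, `UseCheckerboard` and its parameters are hypothetical names, and it assumes that "automatic" falls back to checkerboard whenever the scene color target has fewer than 64 bits per pixel or no separate alpha.

```cpp
#include <cassert>

// Hypothetical names; this only illustrates the cvar semantics from the description.
enum class ECheckerboardMode { Off = 0, On = 1, Auto = 2 };

// Returns true when the subsurface pass should use checkerboard rendering.
// Assumption: in automatic mode, full resolution is chosen only when the scene
// color target has at least 64 bits per pixel and a separate alpha channel.
inline bool UseCheckerboard(int CVarValue, int SceneColorBitsPerPixel, bool bHasSeparateAlpha)
{
    assert(CVarValue >= 0 && CVarValue <= 2);
    switch (static_cast<ECheckerboardMode>(CVarValue))
    {
    case ECheckerboardMode::Off:
        return false; // 0: off, full resolution lighting
    case ECheckerboardMode::On:
        return true;  // 1: always checkerboard
    default:
        // 2: automatic, based on the scene color format
        return !(SceneColorBitsPerPixel >= 64 && bHasSeparateAlpha);
    }
}
```

Under these assumptions, a 32-bit scene color in automatic mode would select checkerboard, while a 64-bit target with separate alpha would select full resolution.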
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for Fortnite character rendering issue. Enable checkerboard rendering by default until we can fix it properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
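The reason a memcpy-created snapshot needs its cached pointer cleared by hand can be shown with a reduced example. This is a sketch under stated assumptions, not the FViewInfo implementation: `FParams`, `FView` and `CreateSnapshot` are hypothetical stand-ins, and the cache is modelled as a plain raw pointer so that the struct stays trivially copyable.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the cached uniform shader parameters.
struct FParams { float BufferSizeX; float BufferSizeY; };

// Hypothetical stand-in for a view that caches a pointer to its parameters.
struct FView
{
    int Width;
    int Height;
    FParams* CachedParams; // belongs to the original view only
};
static_assert(std::is_trivially_copyable<FView>::value, "memcpy needs a trivially copyable type");

// A bitwise copy duplicates the cached pointer as well, so the snapshot would
// alias the original's cache. Clearing it makes the snapshot rebuild its own.
inline FView CreateSnapshot(const FView& Source)
{
    FView Snapshot;
    std::memcpy(&Snapshot, &Source, sizeof(FView));
    Snapshot.CachedParams = nullptr;
    return Snapshot;
}
```

No copy constructor runs during a bitwise copy, so nothing resets the cache automatically; the explicit assignment after the memcpy is the only place that can happen.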
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
}
void FViewInfo::SetupGlobalDistanceFieldUniformBufferParameters(FViewUniformShaderParameters& ViewUniformShaderParameters) const
{
check(GlobalDistanceFieldInfo.bInitialized);
2022-04-22 19:55:41 -04:00
for (int32 Index = 0; Index < GlobalDistanceField::MaxClipmaps; Index++)
2016-08-17 11:38:13 -04:00
{
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 4041614)
#lockdown Nick.Penwarden
============================
MAJOR FEATURES & CHANGES
============================
Change 3774677 by Arne.Schober
DR - Deprecated SetLocal from the RHICmdlist
Fixed some unnecessary PSO collisions.
Change 3809579 by Chris.Bunner
Back out changelist 3774677.
#jira UE-53483
Change 3810363 by Mark.Satterthwaite
More random fixes to mtlpp: most important is the extension to Buffer that allows creation of sub-buffers that are merely views onto a sub-range of the parent. These sub-buffers are valid to use throughout the mtlpp API with two exceptions: they may not be used for visibilityResultsBuffers and Set*BufferOffset functions cannot take this offset into account (as the encoder does not hold onto the buffers and I don't want it to). In the case of Set*BufferOffset the caller has to know what is going on and in the case of visibilityResultsBuffers it'll just assert as it isn't sensible.
This makes it *much* easier to do things like sub-buffer allocation, though the caller must be aware of the alignment restrictions of their intended usage as they are not possible to enforce. For example, a call to SetVertexBuffer requires that the offset alignment match the alignment of the data-type in the shader for "device" resources, while for "constant" data it must be max(4, sizeof(datatype)) on iOS and 256 on macOS. This should allow for much more tightly packed sub-allocations than earlier approaches, though older drivers (e.g. Mac OS X 10.11) enforce only the coarser "constant" data restriction everywhere.
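As an illustration of the "constant" data rule quoted above, a caller doing its own sub-allocation could round offsets up as follows. This is a minimal sketch, not part of mtlpp: `EPlatform`, `ConstantOffsetAlignment` and `AlignUp` are hypothetical helper names, and only the two figures from the description (max(4, sizeof(datatype)) on iOS, 256 on macOS) are encoded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical platform tag; not an mtlpp or Metal type.
enum class EPlatform { iOS, macOS };

// Offset alignment for "constant" data as stated in the description:
// max(4, sizeof(datatype)) on iOS and 256 on macOS.
inline std::size_t ConstantOffsetAlignment(EPlatform Platform, std::size_t DataTypeSize)
{
    return Platform == EPlatform::macOS ? std::size_t(256)
                                        : std::max<std::size_t>(4, DataTypeSize);
}

// Rounds Offset up to the next multiple of Alignment. Division is used rather
// than bit masking because sizeof(datatype) need not be a power of two.
inline std::size_t AlignUp(std::size_t Offset, std::size_t Alignment)
{
    assert(Alignment > 0);
    return (Offset + Alignment - 1) / Alignment * Alignment;
}
```

A sub-allocator would call `AlignUp(NextFreeOffset, ConstantOffsetAlignment(Platform, sizeof(T)))` before handing out a view; since the restriction cannot be enforced by the API, this check has to live on the caller's side.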
Change 3810407 by Marcus.Wassmer
PR #4322: ShadowSetup Bug Fix: Only stencil mask drawn meshes (Contributed by DSDambuster)
Change 3810676 by Guillaume.Abadie
Makes r.Test.SecondaryUpscaleOverride work with any arbitrary pixel size.
Change 3810696 by Guillaume.Abadie
Adds support for #include "../MyFile.ush" in the shader compiler.
Change 3810698 by Guillaume.Abadie
Implements enum class based shader permutation dimension.
Change 3810699 by Guillaume.Abadie
Implements Diaphragm DOF ground work.
Change 3811536 by Guillaume.Abadie
Pulls the trigger on CircleDOF's setup pass for DiaphragmDOF.
Change 3811958 by Mark.Satterthwaite
More fixes for mtlpp.
Change 3811964 by Mark.Satterthwaite
Only views onto a mtlpp::Buffer should return a valid parent-buffer.
Change 3812604 by Guillaume.Abadie
Changes Diaphragm DOF's source file layout.
Change 3812827 by Mark.Satterthwaite
More missing/broken functionality in mtlpp fixed and fixed obvious leaks.
Change 3812920 by Guillaume.Abadie
Adds support for per mip level UAV in FSceneRenderTarget.
Change 3812926 by Mark.Satterthwaite
Change the way we handle mtlpp resource construction to avoid leaks.
Change 3812960 by Rolando.Caloca
DR - vk - Disable DFGI
Change 3812968 by Rolando.Caloca
DR - Linker fix
Change 3813318 by Mark.Satterthwaite
Fix linear texture allocation from a buffer sub-view.
Change 3813326 by Mark.Satterthwaite
Fix another Metal mtlpp sub-buffer allocation failure.
Change 3813328 by Guillaume.Abadie
Removes global samplers in TAA for GL4, Vulkan and Switch.
Change 3813937 by Rolando.Caloca
DR - Fix logs not getting dumped when r.DumpSCWQueuedJobs is on
Change 3813947 by Rolando.Caloca
DR - noshaderworker should override r.XGEShaderCompile
Change 3817017 by Uriel.Doyon
Fixed texture editor black screen
#jira UE-53653
Change 3818568 by Rolando.Caloca
DR - Fix log when shader jobs crash
- Move log10 to common
- Added COMPILER_VULKAN define
Change 3818603 by Uriel.Doyon
Fix to static analysis warning
Change 3818623 by Rolando.Caloca
DR - Workaround hlslcc loop unrolling bug
Change 3819070 by Uriel.Doyon
Fix to stat duplication.
Change 3819105 by Uriel.Doyon
Refactored volume sample shader to avoid using texture dimension.
Change 3819136 by Rolando.Caloca
DR - vk - Per platform files (empty)
Change 3819180 by Rolando.Caloca
DR - vk - Move defines out of config into per platform
Change 3819247 by Rolando.Caloca
DR - vk - Remove more defines into platform settings
Change 3819318 by Rolando.Caloca
DR - vk - Fixes for linking
Change 3819868 by Rolando.Caloca
DR - vk - Linux & Android fixes
Change 3819873 by Guillaume.Abadie
Adds support for PermutationId on r.DumpShaderDebugInfo=1
Change 3819940 by Rolando.Caloca
DR - vk - Fix Linux issues
Change 3819956 by Rolando.Caloca
DR - vk - Invalid check
Change 3819961 by Michael.Lentine
Hide attributes when plugin is not present
Change 3819980 by Rolando.Caloca
DR - vk - Standard validation always
Change 3820039 by Rolando.Caloca
DR - vk - Fix invalid ensure
Change 3820326 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3820422 by Michael.Lentine
Add back GBufferAO.
Change 3820433 by Rolando.Caloca
DR - Fix D3D12 crash on 20 thread (10x2 cores) machines
Change 3821677 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3821961 by Rolando.Caloca
DR - Vulkan uses real UB by default on non-Android
Change 3821968 by Rolando.Caloca
DR - vk - Update glslang 1.0.65.1
Change 3821969 by Uriel.Doyon
Added support for stat groups that must be sorted by name. Defined by DECLARE_STATS_GROUP_SORTBYNAME.
Change 3821983 by Rolando.Caloca
DR - vk - Change to static array (0.1ms on 10k draw calls)
Change 3824141 by Rolando.Caloca
DR - vk - Fix static analysis
- Bumped up some (c) 2017->2018
Change 3824355 by Rolando.Caloca
DR - vk - Accessor to find out if a cmd buffer has been submitted
Change 3824420 by Rolando.Caloca
DR - Sanity check number of queries per batch on D3D11 as to not break other RHIs
Change 3824463 by Rolando.Caloca
DR - Removed dummy ensure for D3D12
Change 3824609 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3826074 by Mark.Satterthwaite
Start IMP-caching the various descriptor types in mtlpp.
Change 3826098 by Rolando.Caloca
DR - vk - Dump layer compile fixes
Change 3826113 by Rolando.Caloca
DR - vk - Missing dump functions
Change 3826302 by Rolando.Caloca
DR - vk - Compile fix
- Change dump handles to %p
Change 3826635 by Mark.Satterthwaite
Forward declarations required for mtlpp compilation without exposing Metal headers - plus fixes to the mtlpp test compiler.
Change 3827072 by Mark.Satterthwaite
Switch some more mtlpp descriptors over to IMPTables from objc_msgSend.
Change 3827909 by Guillaume.Abadie
Replaces diaphragm DOF's prefiltering with LDS bank coherent bilateral reduction, and implements 1/8 res background gathering pass.
Change 3827952 by Guillaume.Abadie
Updates copyright to year 2018 on diaphragm DOF's new files.
Change 3828055 by Rolando.Caloca
DR - vk - Rename in prep for changes
Change 3828229 by Guillaume.Abadie
Avoids logging a global shader type name multiple times when it has multiple permutations while verifying the global shader map.
Change 3828427 by Guillaume.Abadie
Reimplements Max3x3 gathering post filtering for Diaphragm DOF with proper shader permutation.
Change 3829979 by Guillaume.Abadie
Fixes a color NaN source in diaphragm DOF's TAA pass.
Change 3830116 by Rolando.Caloca
DR - vk - Fix GPU queries/frame time on old system
- New system in place, disabled temporarily
Change 3830169 by Rolando.Caloca
DR - vk - Fix async pso creation crash
Change 3830193 by Rolando.Caloca
DR - vk - CPU RHI thread improvement
Change 3830291 by Guillaume.Abadie
Automatically lowers the number of gathering rings on the background half res gather pass as the CoC gets smaller.
Change 3830300 by Rolando.Caloca
DR - vk - Static analysis fix: Split VulkanCommon.h out of VulkanConfiguration.h
Change 3830589 by Mark.Satterthwaite
In mtlpp cache the IMPTables for all the Metal @protocol's that are dependent on the MTLDevice, this avoids a mutex & map lookup. Also make all the concrete types store their IMPTable statically as it won't change.
Change 3830793 by Mark.Satterthwaite
Fix a small number of bugs introduced with the mtlpp descriptor and table caching.
Change 3831491 by Jian.Ru
Fix driver version unknown
#jira UE-53688
Change 3832335 by Rolando.Caloca
DR - vk - Change include
Change 3832550 by Rolando.Caloca
DR - vk - Occlusion query rewrite WIP
Change 3832589 by Rolando.Caloca
DR - vk - Minor refactor to pools in prep for timestamps
Change 3832618 by Rolando.Caloca
DR - vk - Do not block timestamp queries
Change 3832636 by Rolando.Caloca
DR - vk - Fix old timestamp queries
Change 3833138 by Rolando.Caloca
DR - vk - Fix timestamp queries
Change 3833249 by Rolando.Caloca
DR - vk - Test lock
Change 3833667 by Rolando.Caloca
DR - vk - Old queries wait on the RHI thread now instead of the driver (disabled)
Change 3833907 by Daniel.Wright
Fixed NextStartOffset UAV index out of bounds
Change 3833918 by Daniel.Wright
D3D12 RHI: only refcount uniform buffers if GRHINeedsExtraDeletionLatency is false, which is no longer the case for PC or Xbox. The refcounting was heavy on performance as reported by a licensee because FRHIResource uses atomics for refcounting, which is only necessary when GRHINeedsExtraDeletionLatency is disabled.
Change 3834852 by Rolando.Caloca
DR - vk - Missing file
Change 3834858 by Guillaume.Abadie
Implements r.DOF.MinimalFullresBlurringRadius
Change 3834979 by Rolando.Caloca
DR - vk - Fix
Change 3836117 by Rolando.Caloca
DR - vk - Update to 1.0.65.1
Change 3836122 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOcclusionBatchCmdBuffer
- Added new error codes/messages
Change 3836421 by Mark.Satterthwaite
For the purposes of debugging and conformance testing mtlpp make it possible to compile *without* the IMP cache so that we call the underlying Objective-C.
Change 3836896 by Uriel.Doyon
Fixed concurrency and exit issues around d3d12 pipeline states on windows.
Change 3837385 by Rolando.Caloca
DR - vk - Dump memory on OOM
Change 3837427 by Rolando.Caloca
DR - vk - Change some arrays to array views
Change 3837800 by Guillaume.Abadie
Implements SHADER_PERMUTATION_RANGE_INT to make contiguous integer permutations that do not start at 0.
Change 3838128 by Rolando.Caloca
DR - vk - Support for non-cached memory types
Change 3838540 by Guillaume.Abadie
Refactors Diaphragm DOF's CoC tile buffer under a single API for better maintainability.
Change 3838731 by Rolando.Caloca
DR - vk - Descriptor pools per command buffer pool (turned off)
Change 3838961 by Rolando.Caloca
DR - vk - Use ring buffer for per frame uniform buffers
- Enable descriptor pools per layout recycled per command buffer
Change 3839087 by Rolando.Caloca
DR - vk - Compile fixes for Android
Change 3839106 by Marcus.Wassmer
PR #4413: Removing unnecessary call to FString::ToLower (Contributed by gsfreema)
Change 3839252 by Mark.Satterthwaite
Fix mtlpp::Resource move operators.
Change 3839426 by Marcus.Wassmer
Duplicate 380972
Make PC GPU Benchmarks more reliable
Change 3840041 by Guillaume.Abadie
Fixes shader compilation failure in TAA with alpha channel through post processing support.
Change 3840257 by Chris.Bunner
Swapping a mul() to * in HLSLTranslator::Dot to allow scalar transformations per a UDN ticket.
Change 3840308 by Rolando.Caloca
DR - vk - Support for UB & non-UB on emulation mode
Change 3840586 by Rolando.Caloca
DR - Copy 3840577
Fix for CPUs with more than 16 cores
Change 3840671 by Rolando.Caloca
DR - vk - Copy from 3840663
Fix for layout ensure on HMD projects on Vulkan
Change 3840980 by Rolando.Caloca
DR - vk - Android compile fixes
Change 3841989 by Guillaume.Abadie
Slices Diaphragm DOF's Gather pass into multiple shader files, with the CFLAG_StandardOptimization flag for faster iteration time.
Change 3842216 by Guillaume.Abadie
Fixes DDOF's foreground alpha channel.
Change 3842217 by Guillaume.Abadie
Implements r.DOF.MaximalForegroundBlurringRadius
Change 3842353 by Guillaume.Abadie
Allows disabling foreground gathering with r.DOF.MaximalForegroundBlurringRadius=0
Change 3842747 by Rolando.Caloca
DR - vk - Missing use of GPoolSizeVRAMPercentage
- Support for smaller allocations if page size is not available
Change 3842791 by Rolando.Caloca
DR - vk - Use 95% of available GPU memory to handle some fragmentation
Change 3843690 by Guillaume.Abadie
Fixes diaphragm DOF's foreground after all this refactoring.
Change 3844439 by Guillaume.Abadie
Improves the CoC dilate pass to make the gather pass as fast as possible, but still without artifacts caused by the fast gathering optimisation.
Change 3844946 by Mark.Satterthwaite
rd_route v1.1.1 with attached TPS approval.
For macOS function interposition which is useful for debugging and the occasional workaround.
Change 3845164 by Mark.Satterthwaite
Add LLM support for macOS, including tracking of memory allocated in Objective-C. This makes use of runtime method swizzling in the Objective-C runtime and the rd_route library I added for Richard Wallis, which allows for arbitrary runtime function interposition and allows me to hook the custom allocators used in Apple's many Objective-C frameworks on which the whole macOS edifice is built. Objective-C objects are charged to the calling scope as they are too common to impose their own without murdering frame rate.
We would need a TPS approval for an iOS function interposition library for this to work fully on iOS. If desired, in the short term, discarding LowLevelFree events that aren't in the map rather than asserting will work around the problem.
Change 3845849 by Marcus.Wassmer
Fix clang and some normal refactor errors
Change 3846026 by Rolando.Caloca
DR - vk - Descriptor set allocation scheme rewrite
- Type hash for each pool
- Desc sets Pool on device
Change 3846169 by Rolando.Caloca
DR - vk - Remove old code for non-layout descriptor set pools
Change 3846205 by Mark.Satterthwaite
Disambiguate the PatchControlPointOut struct definitions in Metal tessellation shaders at Apple's suggestion to avoid a metallib gotcha.
Change 3846346 by Arne.Schober
DR - Missing Vector instructions
Change 3847037 by Arne.Schober
DR - Fix issue with GPU skincache where the offset of the clothbuffer is not relative to the offset of the actual vertexbuffer.
Fixed MorphTarget Skincache Offset mix-up
Change 3847275 by Marcus.Wassmer
Copying MGPU to Dev-Rendering (//UE4/Dev-Rendering)
Change 3847464 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3847707 by Michael.Lentine
Only use MorphTargetOffset when the shader enables morph targets.
Change 3848533 by Richard.Wallis
Handle Metal adding FirstInstance into [[ instance_id ]] which is different to other APIs. SV_InstanceID and SV_VertexID should now have their respective base instance and base vertex ID's subtracted before use in the shader.
#jira UE-51716
Change 3848625 by Richard.Wallis
Compile Fix
Change 3848725 by Rolando.Caloca
DR - Remove use of Build/SetLocalGraphicsPipelineState
Change 3848797 by Rolando.Caloca
DR - Deprecate Build/SetLocalGraphicsPipelineState
Change 3849237 by Arne.Schober
DR - Add Custom Ver for ModelVertex Serialization
Change 3851247 by Rolando.Caloca
DR - vk - Util functions
Change 3851523 by Arne.Schober
DR - Update Reflection Comparison shot from the BuildFarm.
Change 3851859 by Rolando.Caloca
DR - vk - Skip loader
Change 3851889 by Krzysztof.Narkowicz
Removed lights with lighting channels out of tiled deferred light list. Tiled deferred lights do not support lighting channels and it wasn't worth adding extra complexity to this shader in order to support this special case.
#jira UE-51512
Change 3852181 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3852547 by Uriel.Doyon
Fixed Pre-Exposure shader compilation and Temporal AA issue.
#jira UE-54276
Change 3852637 by Arne.Schober
DR - Fixing Normal Automated Test Result
Change 3853167 by Richard.Wallis
AvfPlayer - support for streaming media. Due to an operator new/delete mismatch in Apple's CFNetwork - we've had to change out one of that framework's allocators using rd_route to avoid the memory corruption.
#jira UE-35637
Change 3853447 by Chris.Bunner
Fixing typos.
Change 3853645 by Krzysztof.Narkowicz
Fixed light functions on subsurface materials
Removed strange code from blending between static and dynamic shadows
#jira UE-50275
Change 3853660 by Rolando.Caloca
DR - Fix OpenGL overwriting texture samplers on forward renderer
Change 3853945 by Mark.Satterthwaite
Duplicate #3831616
Fix the black ground scattering on Metal - we've had issues with the atmospheric fog calculations for a long time - one or more intermediate operations generate different precision on Metal so we end up passing -ve values into sqrt which then generates NaN/INF. For Metal when compiling this file and this file only #define sqrt() to sqrt(abs()) so that we don't see any more unexpected black in atmospheric rendering. This is far from ideal but I don't want to abs all inputs into every sqrt because AFAIK this is the only case where we have an issue, and until we have a way to investigate each intermediate calculation that isn't ridiculously, soul-crushingly tedious, it isn't practical to identify the source of the error.
#jira UE-53720
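As a rough illustration of the sqrt workaround described in the entry above, here is a minimal C++ sketch (not the actual shader code). The helper names `SafeSqrt` and `PlainSqrtIsNaN` are hypothetical; the sketch only shows why a slightly negative input, as can result from precision differences, produces NaN and how taking the absolute value first avoids it.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the idea of "#define sqrt() to sqrt(abs())":
// the input is made non-negative before the square root is taken.
inline float SafeSqrt(float Value)
{
    return std::sqrt(std::fabs(Value));
}

// Returns true when the plain square root of Value is NaN,
// which is what happens for any negative input.
inline bool PlainSqrtIsNaN(float Value)
{
    return std::isnan(std::sqrt(Value));
}

// Small self-check: a tiny negative value (assumed here to stand in for a
// precision error) poisons the plain sqrt but not the guarded version.
inline void DemoSqrtWorkaround()
{
    const float SlightlyNegative = -1.0e-6f;
    assert(PlainSqrtIsNaN(SlightlyNegative));
    assert(!std::isnan(SafeSqrt(SlightlyNegative)));
}
```

Note the trade-off the entry itself points out: the guard hides the symptom rather than locating the intermediate calculation that goes negative.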
Change 3853966 by Mark.Satterthwaite
Duplicate #3835852
Fix tessellation shaders in Metal with Manual Vertex Fetch enabled:
- The control points index buffer shouldn't collide with anything else.
- We can't use the optimisation of loading texture width & height from the buffer meta-table in tessellation shaders as the combined stages don't guarantee not to clobber unused buffer slots and screw it up when we use linear textures.
#jira UE-53851
Change 3854250 by Uriel.Doyon
Fix fbx automation tests
Change 3854736 by Uriel.Doyon
Added a tooltip to the EV100 slider in the exposure menu.
Using game settings now disables the slider.
#jira UE-53945
Change 3855047 by Jian.Ru
Fix DFAO getting NANs when samples out of ViewRect
#jira UE-54403
Change 3858197 by Krzysztof.Narkowicz
View frustum shadow caster culling for pointlights/spotlights
#jira UE-54381
Change 3860081 by Krzysztof.Narkowicz
Tighter bounding sphere for a spotlight
Replaced IntersectSphere(LightProxy->Origin, LightProxy->Radius) with LightProxy->SphereBounds for tighter culling of spotlights
Directional light GetBoundingSphere() now everywhere returns Sphere((0,0,0),HALF_WORLD_MAX) for consistency and proper SphereBounds
#jira UE-54258
Change 3860324 by Mark.Satterthwaite
Update the macOS deployment target version to 10.12 from 10.11 as we officially ended support for El Capitan a while ago. Should mean that libraries compiled for 10.12 and up won't cause link warnings.
Change 3860945 by Arne.Schober
DR - Fix not releasing SRV on render thread for FPositionVertexBuffer, FStaticMeshVertexBuffer, FColorVertexBuffer, FStaticMeshInstanceBuffer.
#jira UE-54587
Change 3861129 by Jian.Ru
Prevent distance culled objects from casting distance field direct shadows
#jira UE-54533
Change 3861502 by Jian.Ru
Exclude distance culled objects from DFAO calculation
#jira UE-54533
Change 3862243 by Krzysztof.Narkowicz
Changed radius of a directional light's bounding sphere from HALF_WORLD_MAX to WORLD_MAX in order to encompass entire WORLD_MAX box
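The arithmetic behind this radius change can be sketched as follows. This is an illustration only: it assumes the "WORLD_MAX box" is an axis-aligned cube centered at the origin with half-extent HALF_WORLD_MAX, and the constant values and helper names are assumptions, not taken from the changelist. Under that assumption the cube's corners sit at sqrt(3) * HALF_WORLD_MAX from the origin, which is more than HALF_WORLD_MAX but less than WORLD_MAX.

```cpp
#include <cassert>
#include <cmath>

// Assumed values for illustration; the real engine constants may differ.
constexpr double AssumedWorldMax = 2097152.0;
constexpr double AssumedHalfWorldMax = AssumedWorldMax * 0.5;

// Distance from the origin to a corner of an origin-centered cube.
inline double BoxCornerDistance(double HalfExtent)
{
    return std::sqrt(3.0) * HalfExtent;
}

// An origin-centered sphere contains the cube only if it reaches the corners.
inline bool SphereContainsBox(double Radius, double HalfExtent)
{
    return Radius >= BoxCornerDistance(HalfExtent);
}

inline void DemoDirectionalLightBounds()
{
    // Old radius: the corners of the cube stick out of the sphere.
    assert(!SphereContainsBox(AssumedHalfWorldMax, AssumedHalfWorldMax));
    // New radius: the whole cube is inside.
    assert(SphereContainsBox(AssumedWorldMax, AssumedHalfWorldMax));
}
```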
Change 3863476 by Krzysztof.Narkowicz
Added BuildReflections option to ResavePackages commandlet
#jira UE-54581
Change 3863717 by Rolando.Caloca
DR - vk - Missed using pipeline cache on compute PSOs
Change 3865332 by Arne.Schober
DR - Fix UE-52356 Bone Weight
Change 3866220 by Rolando.Caloca
DR - vk - Fixed GetNativeResource missing on textures
- Added support for -preferNvidia|AMD|Intel
- Added VulkanRHIBridge.h
- Minor fixes
Change 3866222 by Rolando.Caloca
DR - vk - Missed file
Change 3866951 by Krzysztof.Narkowicz
Fixed FreezeRendering on non editor builds: ComputeAndMarkRelevanceForViewParallel was calling FrozenMatricesGuard on multiple threads, reading and writing view matrices state in parallel.
#jira UE-53640
Change 3867231 by Guillaume.Abadie
Adds alpha mode to allow the tonemapper to passthrough the alpha channel for broadcast industry.
Change 3867233 by Guillaume.Abadie
Fixes a compilation failure in TAAU with r.PostProcessing.PropagateAlpha==2
Change 3867594 by Daniel.Wright
Removed EditorOnlyDefaultMaterials, which added 79s of shader compilation during startup
Added a dialog when opening the Material Editor on a Default Material, warning of advanced workflow
Preventing Material Editor Apply or Save for a Default Material when the preview material has compilation errors
Change 3870048 by Daniel.Wright
Cleaned up formatting in TranslucentRendering from merges
Change 3870106 by Krzysztof.Narkowicz
Fixed some FArchive Tell()/Seek() 64bit->32bit truncations
Change 3870211 by Rolando.Caloca
DR - vk - Added -vulkanvalidation=N/-vulkanstandardvalidation/-novulkanstandardvalidation to set validation layer behaviour from cmd line
Change 3870225 by Rolando.Caloca
DR - vk - Some platforms do not use a standard swapchain
Change 3870267 by Arne.Schober
DR - SafeRelease SRVs that might be held by the Vertexfactories (maybe due to indirect use in GlobalResources)
Note that the VFs are not owners of the data, e.g. the underlying Buffers might be released before this and this reference counting should be unnecessary
Change 3870647 by Daniel.Wright
Moved FogRendering.h to Renderer
Change 3872130 by Krzysztof.Narkowicz
Disable USE_GLOBAL_CLIP_PLANE for MATERIAL_DOMAIN_POSTPROCESS and MATERIAL_DOMAIN_UI
Merging GitHub Pull request #4459
"When material domain is not needing global clip plane there is no need to generate any code involving it. This does not alter output but removes lot of code at vertex shader and pixel shaders. At least on mobile rendered was actually generating clipping code for ui materials."
#jira UE-54616
Change 3872145 by Rolando.Caloca
DR - vk - Optional SupportsMarkersWithoutExtension
Change 3872404 by Uriel.Doyon
Added some guards when streaming virtual textures.
Fixed optimized UCanvasRenderTarget2D::RepaintCanvas() to prevent resolving the texture twice.
Fixed bad mipmap generation with UCanvasRenderTarget2D.
Change 3872507 by Arne.Schober
Back out changelist 3870267
Change 3874176 by Ben.Marsh
IncludeTool: Add a flag to prevent scanning source files for exported symbols.
Change 3874935 by Krzysztof.Narkowicz
Fixed white thumbnails and other issues with sky lighting on ES3_1 path, by disabling GGX prefiltering, as mobile path doesn't have a single cubemap with all initialized mips. Instead it ping-pongs between 2 partially initialized.
#jira UE-54656
Change 3875710 by Daniel.Wright
Renamed uniform buffer member macros to be much shorter for readability
Change 3876665 by Guillaume.Abadie
Cherry-pick 3870715: Implements DOF's hybrid scattering bare bones.
Change 3876666 by Guillaume.Abadie
Cherry-pick 3871786: DOF hybrid scattering: fixes NaN source, transition to gather on close to screen edge and low intensity.
Change 3876677 by Guillaume.Abadie
Cherry-pick 3872348: Implements neighbor comparison for DOF's scattering compilation pass.
Change 3876680 by Guillaume.Abadie
Cherry-pick 3872357: Oups... fixes build...
Change 3876683 by Guillaume.Abadie
Cherry-pick 3872475: Controls number of mip to generate with DOF's reduce pass.
Change 3876687 by Guillaume.Abadie
Cherry-pick 3874104: Fixes various bugs in diaphragm DOF's hybrid scattering.
Change 3876690 by Guillaume.Abadie
Cherry-pick 3874144: Packs multiple DOF scattering group into same draw instance.
Change 3876694 by Guillaume.Abadie
Cherry-pick 3874275: Switches hybrid scattering with indexed indirect draw call to reduce scatter vertex shader invocation.
Change 3876695 by Guillaume.Abadie
Cherry-pick 3874674: Records min and max coc on DOF's setup's draw event.
Change 3876783 by Rolando.Caloca
DR - Static analysis fix
Change 3876845 by Guillaume.Abadie
Implements USceneCaptureComponent::ProfilingEventName
Change 3877197 by Rolando.Caloca
DR - vk - OQ fixes (disabled)
Change 3877428 by Krzysztof.Narkowicz
Merged with tiny tweaks Ansel photography plugin improvements from Adam Moss (GitHub pull request #4426):
-The free-roaming photography camera has new constraints by default, i.e. it can't pass through walls
-Photography session can be started and stopped programmatically, e.g. making it possible to bind photography to an alternative hotkey or button combo. This was an often-requested feature.
-Tweakables and utilities are now exposed through a Blueprint Function Library (rather than direct manipulation of console variables)
-The Ansel photography session UI now exposes some engine effect tweakables as sliders. For example, if the game is using depth-of-field then sliders are made available to allow the photographer to change the focal depth etc. The developer may suppress this behavior through the Blueprint Function Library.
-Letterboxing is now removed during multi-part capture, d'oh.
-Tiled shots are taken at full resolution even if ScreenPercentage < 100
-SSR is enabled during super-resolution shots since Ansel is now better at hiding any ensuing artifacts
-Postprocess settings are frozen at session start to avoid discontinuities during photography, i.e. wandering between postprocess volumes when the camera auto-moves for stereo and 360 shots.
#jira UE-54244
#4426
Change 3879086 by Krzysztof.Narkowicz
Fixed sky/reflection capture (without owner) update - they are now updated only with a corresponding world
Change 3879090 by Guillaume.Abadie
Fixes tons of regressions on diaphragm DOF's recombine passes.
Change 3879198 by Rolando.Caloca
DR - vk - Support for real uniform buffers on Android platforms
Change 3879993 by Krzysztof.Narkowicz
-Fixed int64->int32 FArchive offset truncation in TShaderMap, VertexFactory and TextureDerivedData
-Fixed FSerializationHistory bug, when trying to serialize 0 bytes
#jira UE-43203
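The class of bug fixed in the entry above (a 64-bit archive offset stored in a 32-bit variable) can be sketched in plain C++ as follows. This is a hedged illustration, not the engine code: `TruncateOffset` and `FitsInInt32` are hypothetical helpers, and the 5 GiB offset is an arbitrary example value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Simulates storing a 64-bit offset in a 32-bit variable. Going through
// uint32_t keeps the low 32 bits in a well-defined (modular) way.
inline int32_t TruncateOffset(int64_t Offset)
{
    return static_cast<int32_t>(static_cast<uint32_t>(Offset));
}

// True when the offset survives a round trip through int32_t unchanged.
inline bool FitsInInt32(int64_t Offset)
{
    return Offset >= std::numeric_limits<int32_t>::min()
        && Offset <= std::numeric_limits<int32_t>::max();
}

inline void DemoOffsetTruncation()
{
    // Hypothetical position 5 GiB into a large archive.
    const int64_t LargeOffset = 5LL * 1024 * 1024 * 1024;
    assert(!FitsInInt32(LargeOffset));
    // The truncated value silently points somewhere else in the file.
    assert(static_cast<int64_t>(TruncateOffset(LargeOffset)) != LargeOffset);
    // Small offsets are unaffected, which is why such bugs go unnoticed.
    assert(static_cast<int64_t>(TruncateOffset(4096)) == 4096);
}
```

Keeping the offset in a 64-bit type end to end avoids the problem without any range check.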
Change 3881462 by Guillaume.Abadie
Implements full res DOF's setup pass for cheaper full res gathering in recombine pass.
Change 3881524 by Krzysztof.Narkowicz
Fixed compilation by removing FTickableEditorObject from FPreviewScene
Change 3881724 by Chris.Bunner
Static analysis fix.
#jira UE-54762
Change 3881861 by Rolando.Caloca
DR - vk - Fix layout warning when generating mip chain
Change 3881864 by Rolando.Caloca
DR - Use render passes on HZB
Change 3882236 by Yuriy.ODonnell
IndirectLightingColorScale is now applied to SubsurfaceLighting and DiffuseLighting. Was previously only applied to DiffuseLighting.
#jira UE-42534
#github 3326
Change 3882325 by Guillaume.Abadie
Implements FocusOnly lower gathering pass for Diaphragm DOF's slight out of focus temporal stability.
Change 3882340 by Rolando.Caloca
DR - vk - Fix api dump
Change 3882430 by Rolando.Caloca
DR - vk - KHR_maintenance2
Change 3882563 by Rolando.Caloca
DR - Add depth-stencil access mode to PSO initializer
Change 3882929 by Rolando.Caloca
DR - vk - Proper fix for maintenance extension macros
Change 3883087 by Mark.Satterthwaite
Allow disabling VSync in windowed mode for macOS 10.13.4 and above.
Change 3883597 by Guillaume.Abadie
Collapses full and half res DOF setup passes together.
Change 3883702 by Guillaume.Abadie
Fixes mac's build.
Change 3884747 by Uriel.Doyon
Fix for static analysis warning
Change 3884975 by Rolando.Caloca
DR - vk - Move some platform defines to platform properties
Change 3884988 by Rolando.Caloca
DR - vk - Make an override per platform
Change 3885832 by Rolando.Caloca
DR - vk - Cosmetic change to group similar members
Change 3885891 by Rolando.Caloca
DR - vk - Some _RenderThread functions to avoid stalls
Change 3886044 by Rolando.Caloca
DR - Added RHI api _RenderThread version of
RHICreateTextureReference
RHICreateShaderLibrary
RHICreateRenderQuery
Change 3886560 by Guillaume.Abadie
Fixes strong aliasing on TAAU's fast shader permutation.
This adds a 6th neighbor sampling, and switches AA_TONE ON as TAA does for its fast shader permutation.
Change 3886749 by Guillaume.Abadie
Cherry-pick 3884748: Implements DOF's BuildBokehLUT for diaphragm blades simulation.
Only used in hybrid scattering for now.
Change 3886750 by Guillaume.Abadie
Cherry-pick 3885457: Simulates diaphragm blades' curvature on bokeh.
Change 3886752 by Rolando.Caloca
DR - Fix metal static analysis
Change 3887460 by Uriel.Doyon
Fixed two more static analysis warnings.
Change 3888201 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitAfterEveryEndRenderPass
- Fixed bad layout on rendering back buffer
Change 3888209 by Rolando.Caloca
DR - vk - Unity compile fix
Change 3888254 by Rolando.Caloca
DR - vk - Fix async texture layout
Change 3888893 by Guillaume.Abadie
Simulates bokeh in DOF's slight out of focus.
Change 3889085 by Guillaume.Abadie
Fixes DOF's reduce pass sampling outside viewport.
Change 3889924 by Rolando.Caloca
DR - vk - Skip seemingly bad validation error
Change 3890573 by Daniel.Wright
Only initialize FDiaphragmDOFGlobalResource in Feature Level 5
Change 3890590 by Arne.Schober
DR - Fix Paper2d crash. When addMesh is called the Vertex and Indexbuffers are nulled out. re-create Dynamic Mesh builder for every Mesh instead.
#jira UE-55063
Change 3890638 by Arne.Schober
DR - Better fix for Paper2d which honors batching
#jira UE-55063
Change 3891099 by Krzysztof.Narkowicz
1.5 texel shadow offset fix inside Manual2x2PCF based on #4485 GitHub pull request
#jira UE-54985
#4485
Change 3891234 by Krzysztof.Narkowicz
Optimized PCF2x2 and PCF3x3 - merged #4494 GitHub pull request
#jira UE-55121
Change 3891407 by Rolando.Caloca
DR - vk - Set vendor id earlier
Change 3891417 by Rolando.Caloca
DR - vk - Missing layout transitions
Change 3891718 by Arne.Schober
DR - Do not recreate one Frame Resource for dynamic draws
#jira UE-55063
Change 3891925 by Yuriy.ODonnell
Fix/workaround for inconsistent preprocessor definitions for NVAftermath that result in FD3D11DynamicRHI class layout mismatch. NVAftermath support is now enabled by default for Win64.
NVAftermath is declared as a private dependency in D3D11RHI. It does not automatically propagate to modules that explicitly include private RHI headers (OculusHMD, OSVR, OSVRInput). This results in NV_AFTERMATH being defined while compiling RHI module and not defined when compiling other modules, causing memory corruption at runtime.
The long-term solution for this and similar issues requires some mechanism for adding transitive module dependencies, so that anyone that depends on D3D11RHI module would automatically also get the NVAftermath. Additionally, private headers should *never* be included directly by external modules.
The short-term solution is to explicitly add NVAftermath dependency to OculusHMD, OSVR and OSVRInput.
Additionally, NV_AFTERMATH is no longer forced by D3D11RHIPrivate.h when it's not defined. This allows catching this kind of mismatch in the future through a compiler warning (C4668).
#jira UE-53065
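The layout mismatch described in the entry above can be illustrated with a small C++ sketch. The two structs below are hypothetical stand-ins (not the real FD3D11DynamicRHI) for what two modules would see when one compiles a header with a macro such as NV_AFTERMATH defined and the other compiles it without: a conditionally compiled member shifts the offset of everything after it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view of the class from a module compiled WITH the macro:
// the conditional member is present.
struct FRHIViewWithAftermath
{
    int DeviceIndex;
    void* AftermathHandle; // only exists when the macro is defined
    int FeatureLevel;
};

// Hypothetical view of the same class from a module compiled WITHOUT it.
struct FRHIViewWithoutAftermath
{
    int DeviceIndex;
    int FeatureLevel;
};

// The two modules disagree about where FeatureLevel lives, so one of them
// reads or writes the wrong bytes of a shared object.
static_assert(offsetof(FRHIViewWithAftermath, FeatureLevel)
                  != offsetof(FRHIViewWithoutAftermath, FeatureLevel),
              "layouts are expected to differ");

inline bool LayoutsDisagree()
{
    return sizeof(FRHIViewWithAftermath) != sizeof(FRHIViewWithoutAftermath);
}
```

Because nothing fails at compile or link time in such a setup, the disagreement only shows up as memory corruption at runtime, which is why the entry favors surfacing the undefined macro through a compiler warning.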
Change 3891987 by Rolando.Caloca
DR - vk - Support for dedicated allocations
Change 3892339 by Jian.Ru
Fix a crash when tessellation shaders are used in dx12
#jira UE-55127
Change 3892528 by Rolando.Caloca
DR - vk - Update Linux headers
Change 3892867 by Rolando.Caloca
DR - vk - Don't create swapchain if not needed
Change 3893416 by Guillaume.Abadie
Implements bokeh simulation on foreground and background gather.
Change 3893732 by Chris.Bunner
GetRelevance_Internal should use the immediate parent resource, not the base, as some features are overridden by permutations e.g. UsesWorldPositionOffset.
#jira UE-53404
Change 3893868 by Guillaume.Abadie
Allocates diaphragm DOF's buffers and structured buffer only on supported platforms.
Change 3893917 by Chris.Bunner
Potential fix for CIS.
Change 3893933 by Chris.Bunner
Duplicating CL 2647737 as this is the same issue from that JIRA where accessing game-thread data was being prevented. We don't have this check in UMaterial::GetMaterialResource already, but presumably the UMaterialInstance case was never removed as we've not been calling it until now.
Change 3894218 by Rolando.Caloca
DR - vk - Remove stat counters per draw call, gains 10% CPU on Infiltrator
Change 3894579 by Arne.Schober
RT - Fix assert not in RenderingThread from Triangle Renderer.
#jira UE-55247
Change 3894724 by Rolando.Caloca
DR - vk - New API for batching barriers
Change 3894909 by Arne.Schober
DR - Fix crash in Speedtree wind where Renderdata is unavailable
#jira UE-54544
Change 3895414 by Rolando.Caloca
DR - Add a configurable threshold for SCWs time outs
Change 3896429 by Marcus.Wassmer
Allow variable frame-latency delay in FrameGrabber frames. For performance you want at least a 1 frame delay so you don't sync the GPU to the CPU.
Change 3896495 by Marcus.Wassmer
Set pointer properly
Fix CIS
Change 3897253 by Guillaume.Abadie
Fixes CIS warning in diaphragm DOF
Change 3899179 by Guillaume.Abadie
Implements background hybrid scatter occlusion for diaphragm DOF.
Change 3903654 by Rolando.Caloca
DR - vk - Rework dump layer to allow other layers
Change 3903766 by Rolando.Caloca
DR - vk - More wrappers
Change 3904025 by Rolando.Caloca
DR - vk - More wrappers
Change 3904342 by Rolando.Caloca
DR - vk - Track image resources & callstacks
Change 3904346 by Rolando.Caloca
DR - vk - Copy fix from 4.19 for flickering grass
Change 3904510 by Rolando.Caloca
DR - vk - Compile fix
Change 3904914 by Daniel.Wright
[Integrate] Fixed PS4 transitions with forward shading
Change 3904916 by Daniel.Wright
[Integrate] Fixed PS4 transitions with occlusion queries
Change 3905975 by Rolando.Caloca
DR - vk - Missing wrappers
Change 3905977 by Rolando.Caloca
DR - vk - Missed file
Change 3907829 by Rolando.Caloca
DR - Move depth bounds to the PSO
Change 3907832 by Rolando.Caloca
DR - vk - Prep for delaying transitions
Change 3907834 by Rolando.Caloca
DR - vk - Fix for depth stencil issues/validation errors
Change 3907967 by Rolando.Caloca
DR - vk - Linux compile
Change 3908093 by Rolando.Caloca
DR - vk - Fix depthstencil layout on descriptors
Change 3908393 by Rolando.Caloca
DR - vk - Disable dedicated allocation as it causes crashes on Nvidia 700 series
Change 3908401 by Rolando.Caloca
DR - Do transitions outside render pass
Change 3908422 by Rolando.Caloca
DR - vk - Fix transition state not getting stored
Change 3908735 by Guillaume.Abadie
Cherry-pick 3896619: Fixes after TAAU post process material that had wrong default buffer UV.
#jira UE-55317
Change 3908736 by Guillaume.Abadie
Cherry-pick 3891352: Fixes ensure when visualizing HDR with TAAU.
#jira UE-55019
Change 3908753 by Guillaume.Abadie
Lets the renderer lay out the views in the internal render targets like it prefers.
Change 3909119 by Daniel.Wright
Fix some static analysis warnings
Change 3911943 by Rolando.Caloca
DR - vk - Fix for packaging Vulkan projects
Change 3912145 by Rolando.Caloca
DR - vk - Fix layout on streaming textures
Change 3913029 by Rolando.Caloca
DR - Fix missing transition
Change 3913048 by Rolando.Caloca
DR - Fix for hlslcc
Change 3913054 by Rolando.Caloca
DR - vk - Fix number of layers on barrier
Change 3913171 by Rolando.Caloca
DR - vk - Fix for decal missing transition
Change 3913211 by Rolando.Caloca
DR - vk - Add debug name to image tracking
Change 3913449 by Rolando.Caloca
DR - vk - Restore transition
Change 3913466 by Rolando.Caloca
DR - Fix Vulkan EngineTest
Change 3913537 by Rolando.Caloca
DR - vk - Fixes independent samplers & textures (contributed by AMD)
Change 3913548 by Rolando.Caloca
DR - vk - Warning fix
Change 3913691 by Rolando.Caloca
DR - vk - Fixes for parallel (wip)
Change 3914656 by Rolando.Caloca
DR - vk - Fix bug when using separate samplerstates and textures
Change 3914730 by Rolando.Caloca
DR - vk - Bump version
Change 3914764 by Rolando.Caloca
DR - vk - Don't crash on exit
Change 3915532 by Rolando.Caloca
DR - vk - Parallel context fixes
Change 3915589 by Rolando.Caloca
DR - vk - Hoist and rename transition and layout manager class out of the context
Change 3915592 by Rolando.Caloca
DR - Fix gpu marker name
Change 3917607 by Rolando.Caloca
DR - vk - Fix depth bounds on Vulkan
Change 3917609 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3917616 by Rolando.Caloca
DR - Fix D3D11 initialization
Change 3920569 by Rolando.Caloca
DR - vk - Prep for layout mgr refactor
Change 3921023 by Rolando.Caloca
DR - vk - Dump layer fixes
Change 3921623 by Rolando.Caloca
DR - vk - Prep refactor for layouts
- Dump now shows marker tree
Change 3922007 by Rolando.Caloca
DR - vk - Fix extra allocation per draw call
Change 3922442 by Rolando.Caloca
DR - vk - Detect potential issues
Change 3922470 by Rolando.Caloca
DR - vk - Minor optimization
Change 3922482 by Rolando.Caloca
DR - vk - More minor optimizations
Change 3923158 by Rolando.Caloca
DR - Move r.DisableEngineAndAppRegistration out to common RHI and use it on Vulkan
Change 3923486 by Rolando.Caloca
DR - vk - Minor cpu optimizations
Change 3923505 by Rolando.Caloca
DR - vk - Use bigger allocations for uniform buffers
Change 3923516 by Rolando.Caloca
DR - vk - Android compile fix
Change 3923557 by Rolando.Caloca
DR - vk - Cache descriptorset layouts, refactor duplicated code
Change 3923851 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3924153 by Rolando.Caloca
DR - vk - Support for dynamic UBs
Change 3924193 by Rolando.Caloca
DR - vk - Remove old per pso descriptor pools
Change 3924197 by Rolando.Caloca
DR - vk - Remove unused global uniform buffer pool
Change 3924220 by Rolando.Caloca
DR - vk - Wrap some unused classes in their define
Change 3924234 by Rolando.Caloca
DR - vk - Show ring buffer wrapping messages
Change 3924243 by Rolando.Caloca
DR - vk - Fix bad dynamic buffer
Change 3924902 by Rolando.Caloca
DR - vk - Fix crash running infiltrator
Change 3925209 by Rolando.Caloca
DR - vk - Fix bug with dynamic buffers
- Remove old defines
Change 3925300 by Rolando.Caloca
DR - vk - Allow packed uniforms as dynamic UBs (with r.Vulkan.DynamicGlobalUBs)
Change 3925627 by Rolando.Caloca
DR - vk - Move DynamicOffsets into the pipeline state
Change 3925834 by Rolando.Caloca
DR - vk - Cache per stage information
Change 3925835 by Daniel.Wright
Fixed DisplayName for UParticleModuleCollisionGPU
Change 3925897 by Rolando.Caloca
DR - vk - Split update descriptors loop
Change 3926488 by Rolando.Caloca
DR - vk - 16MB for ring buffer on desktop, 8 MB for mobile
Change 3928168 by Guillaume.Abadie
Cherry-pick 3917219: Implements r.DOF.RecombineQuality
Change 3928173 by Guillaume.Abadie
Cherry-pick 3927888: Enables r.DOF.HybridScatter.BackgroundCompositing and r.DOF.HybridScatter.ForegroundCompositing to work when both enabled.
Change 3928216 by Rolando.Caloca
DR - vk - Fix Android
- Fix static analysis
Change 3929119 by Rolando.Caloca
DR - vk - Rename some classes for clarity
- Fix read-only cvar
Change 3929151 by Rolando.Caloca
DR - vk - Rename class
Change 3930046 by Rolando.Caloca
DR - Temp fix Vulkan flickering grass
Change 3930148 by Rolando.Caloca
DR - vk - Only update dirty descriptors
- Use dynamic descriptors for packed global uniform buffers
Change 3930998 by Guillaume.Abadie
Packs shader permutation in different XGE submissions.
Change 3931079 by Rolando.Caloca
DR - vk - Fixes for Android and non-real ubs platforms
Change 3931942 by Krzysztof.Narkowicz
Depth rendering - When EarlyZPassMode is set to DDM_AllOccluders, dynamic objects need also to test bUseAsOccluder just like static ones
#jira none
Change 3932819 by Daniel.Wright
[Integrate] Scene Textures uniform buffer
* Base Pass Uniform Buffer now contains a Scene Textures uniform buffer. Previously the translucent base pass had to check ~40 loose scene texture parameters every draw.
* FMeshMaterialShader's must now bind PassUniformBuffer and supply a valid pass uniform buffer. For most passes this is just FSceneTextureUniformParameters.
* FRendererModule::DrawTileMesh can now cleanly set dummy scene texture resources, just by configuring how the pass uniform buffer is created.
* Moved scene texture shader functions out of Common, into SceneTexturesCommon which must be manually included by shaders that want to use them
* Separate Mobile Scene Textures uniform buffer to silo the platform complexities
Moved DBuffer inputs out of FDeferredPixelShaderParameters and into FOpaqueBasePassUniformParameters
Removed per-frame material uniform expressions. GameTime material node with period is now implemented with an fmod in the shader, without the use of MaterialFloat, so that it will happen at full precision.
* Per-frame expressions were used when the GameTime material node had a period, to do the fmod on the CPU where 32 bit precision is guaranteed, for mobile GPU's where pixel shader precision is sometimes less than 32fp.
Moved forward shading data into the Base Pass Uniform Buffer
Removed instanced stereo support for the light cull grid - will have to be reimplemented without changing SRV's per draw
Base pass sets View Uniform Buffer from DrawRenderState instead of choosing which one to set per-draw
Fixed padding in nested uniform buffer structs
Skip SRV members on Feature Level SM4 and below
Change 3932964 by Rolando.Caloca
DR - vk - Renderdoc on Android
Change 3933095 by Daniel.Wright
Moved FSceneTextureUniformParameters out of the opaque base pass uniform buffer.
* Base Pass shaders now enable SCENE_TEXTURES_DISABLED when compiling for a material of any domain other than MD_Surface. These are used when rendering thumbnails of a material in a different domain, which could be opaque, but the opaque base pass drawing policy does not bind a scene textures uniform buffer, so the shader must not bind it.
* Opaque materials can no longer use EyeAdaptation.
Change 3933096 by Daniel.Wright
Better d3d11 assert message when a uniform buffer was not set by the renderer
Change 3933176 by Rolando.Caloca
DR - vk - Prefer mailbox if available
Change 3933271 by Ryan.Vance
#jira UE-55936
Fixed missing referenced uniform bindings on AR pass-through camera shaders.
Change 3934000 by Guillaume.Abadie
Fixes Win32 build in ShaderCompilerXGE.cpp
Change 3934299 by Guillaume.Abadie
Fixes a bug in DOF's reduce operator that was causing color leaking between background and foreground.
Change 3934699 by Daniel.Wright
Added bAffectDistanceFieldLighting to landscape
Change 3935190 by Daniel.Wright
Forward Light Grid SRV's use StructuredBuffer on Metal, instead of 'invariant Buffer', which throws off RemoveUniformBuffersFromSource parsing
Change 3935606 by Daniel.Wright
Removed LightmapPolicy::Set which was needed for vertex lightmaps
Renamed FVertexFactory::Set to SetStreams to make it findable
Change 3936510 by Rolando.Caloca
DR - vk - Update glslangValidator.exe to 1.0.65.1 for dumped debug SPIRV shaders
Change 3936545 by Richard.Wallis
Clone of CL's (3925763, 3925430, 3925424, 3925385, 3925278) Mark Satt's Xcode fixes from task stream //Tasks/UE4/Dev-UERNDR-354-mtlpp/
Plus XCode 9.2 compile fix in ApplicationPlatformCompilerPreSetup.h for -Wunused-lambda-capture.
Change 3938061 by Daniel.Wright
Vulkan: Added support for SRV's in Uniform Buffers
Change 3938123 by Daniel.Wright
Vulkan: Slightly better assert for null resources in uniform buffer
Change 3939197 by Rolando.Caloca
DR - vk - Disable custom memory mgmt
Change 3939677 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3939809 by Rolando.Caloca
DR - vk - Fixes for async compute
Change 3939875 by Rolando.Caloca
DR - vk - Support for -vktrace
Change 3939977 by Rolando.Caloca
DR - vk - Skip a condition during gather UBs
- Set up efficient compute async var
- Fix validation cmd line
Change 3939982 by Rolando.Caloca
DR - vk - Revert mipchain
Change 3939984 by Rolando.Caloca
DR - vk - Remove unnecessary asserts
Change 3940082 by Rolando.Caloca
DR - vk - Custom mem mgr
Change 3940475 by Rolando.Caloca
DR - vk - Fix DFAO (indirect draw offset)
Change 3940555 by Rolando.Caloca
DR - vk - Minor fixes
Change 3940675 by Rolando.Caloca
DR - vk - Fix indirect type mismatch
Change 3941111 by Rolando.Caloca
DR - Renderpass bGeneratingMips
Change 3941847 by Daniel.Wright
Fixed Volumetric Lightmaps on Static geometry only working if the geometry had been built with Surface Lightmaps before
Change 3941978 by Rolando.Caloca
DR - vk - Minor fixes for presenting on compute queue
Change 3942074 by Rolando.Caloca
DR - vk - Remove some RHI stalls
- Fixed swap chain stat
Change 3943946 by Daniel.Wright
Fixed Texcoord0 on Volume materials on a particle sprite, including SubUV particles.
Change 3944065 by Daniel.Wright
Fixed SceneDepth collision getting broken on GPU particles when a scene capture is rendering
Change 3944158 by Daniel.Wright
Fixed ViewUniformShaderParameters accessing GEngine->PreIntegratedSkinBRDFTexture too early during slate loading screen
Change 3944865 by Rolando.Caloca
DR - vk - Prep for render passes
Change 3945196 by Rolando.Caloca
DR - Move render pass validate to cpp
Change 3945202 by Rolando.Caloca
DR - vk - Some fixes for using real render passes
Change 3945357 by Rolando.Caloca
DR - Fix bad condition
Change 3946295 by Yuriy.ODonnell
Added a sentinel member to FLightMap, which is initialized in the ctor and reset in the dtor. Sentinel is then checked in FLightCacheInterface::GetLightMapInteraction().
This aims to shed some more light on a hard-to-repro crash, which is suspected to be a use-after-free bug: http://crashreporter/Buggs/Show/1785593
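The sentinel approach described in this change can be sketched roughly as follows. This is a minimal illustration and not the actual FLightMap code: the class name `FSentinelGuarded`, the magic values and `CheckAlive()` are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a class suspected of being used after free.
class FSentinelGuarded
{
public:
    // The constructor stamps the object as alive.
    FSentinelGuarded() : Sentinel(AliveValue) {}

    // The destructor overwrites the stamp, so a later access through a
    // dangling pointer is likely to observe DeadValue instead of AliveValue.
    ~FSentinelGuarded() { Sentinel = DeadValue; }

    bool IsAlive() const { return Sentinel == AliveValue; }

    // Intended to be called at the top of any method that is suspected of
    // running on an already destroyed object.
    void CheckAlive() const { assert(IsAlive() && "object used after destruction"); }

private:
    static constexpr std::uint32_t AliveValue = 0x600DF00Du; // arbitrary magic value
    static constexpr std::uint32_t DeadValue = 0xDEADBEEFu;  // arbitrary magic value

    // volatile discourages the compiler from eliding the store in the destructor.
    volatile std::uint32_t Sentinel;
};
```

Reading a destroyed object is still undefined behaviour, so a check like this is only a debugging aid that raises the odds of catching the bug near its source; it is not a guarantee.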
Change 3946407 by Rolando.Caloca
DR - vk - Prep for refactor
Change 3946648 by Rolando.Caloca
DR - vk - Fixes for async compute (wip)
Change 3947299 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3948434 by Rolando.Caloca
DR - vk - Fix exiting with parallel
Change 3948928 by Rolando.Caloca
DR - vk - Fix enabling draw markers for tools
Change 3949021 by Rolando.Caloca
DR - vk - Buffer tracking layer
Change 3949602 by Rolando.Caloca
DR - vk - static analysis fix
Change 3949757 by Rolando.Caloca
DR - vk - Remove bogus parameter
Change 3949810 by Rolando.Caloca
DR - vk - Move waits for cmd buffer
Change 3950270 by Guillaume.Abadie
Implements dedicated gather pass for foreground hole filling to avoid being VGPR bound in foreground gather pass, while still being able to amend foreground.
Change 3950272 by Rolando.Caloca
DR - vk - Minor refactor for semaphores
Change 3950279 by Guillaume.Abadie
Oups... fixes build
Change 3950298 by Rolando.Caloca
DR - vk - Gather wait semaphores in the cmd buffers
Change 3950371 by Rolando.Caloca
DR - vk - fixes for async compute
Change 3950597 by Rolando.Caloca
DR - vk - Fix for clip distance (fixes planar reflections)
Change 3951075 by Rolando.Caloca
DR - vk - Fix for async compute
Change 3952524 by Guillaume.Abadie
Some DOF enum refactoring.
Change 3955016 by Daniel.Wright
Fixed BuiltData package getting renamed into the map package during a content browser folder move, causing a redirector to be incorrectly placed in the map package
Change 3955668 by Guillaume.Abadie
Fixes a bug where full res coc buffer was computed even if not doing slight out of focus.
Change 3956722 by Guillaume.Abadie
Fixes a bug where r.DOF.MaximalForegroundBlurringRadius was screen percentage dependent.
Change 3959212 by Guillaume.Abadie
Prefixes all DOF's shaders files with DOF keyword.
Change 3959705 by Guillaume.Abadie
Optimises the DOF setup pass outputting half res and full res with LDS downsample.
Change 3959941 by Guillaume.Abadie
Halves DOF's hybrid scatter compilation by using a unique downsampling for both foreground and background, instead of 2 reduce passes.
Change 3962273 by Rolando.Caloca
DR - Fix typos
#jira UE-56317
PR #4586
Change 3962615 by Rolando.Caloca
DR - vk - Compile fix
Change 3962949 by Rolando.Caloca
DR - Fix DOFDownsample extension
Change 3962993 by Guillaume.Abadie
Back out changelist 3962949
Change 3963016 by Guillaume.Abadie
Adds missing DOFDownsample.usf
Change 3963041 by Rolando.Caloca
DR - vk - Misc changes to help integrate
Change 3964293 by Guillaume.Abadie
Fixes DOF's setup pass reading outside of the viewport.
Change 3964475 by Guillaume.Abadie
Collapses DOF's hybrid scatter compilation passes into reduce passes.
Change 3964883 by Daniel.Wright
Fixed 3d texture in uniform buffer on unsupporting RHI
Change 3964897 by Rolando.Caloca
DR - Compile fixes
Change 3964914 by Guillaume.Abadie
Fixes a bug on r.DOF.RecombineQuality=0
Change 3965153 by Guillaume.Abadie
Fixes compile warning in D3D12Commands.cpp.
Change 3965814 by Rolando.Caloca
DR - Prep for integration conflict resolve
Change 3965899 by Rolando.Caloca
DR - Fix odd linkage issue
Change 3966072 by Rolando.Caloca
DR - More prep for merge
Change 3966163 by Rolando.Caloca
DR - Merge prep
Change 3966844 by Guillaume.Abadie
Packs multiple DOF scattered bokeh per instance and uses PT_RectList in DOF for platforms that can.
Change 3967116 by Rolando.Caloca
DR - Compile fixes for integration
Change 3967273 by Rolando.Caloca
DR - Use same path for mip generation
Change 3967277 by Rolando.Caloca
DR - vk - Fix mips on cubemaps
Change 3967693 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, missing shaders
Change 3967851 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, Engine 2/2
Change 3968083 by Rolando.Caloca
DR - Integration compile fixes
Change 3968240 by Rolando.Caloca
DR - Shader compile fixes for integration
Change 3968270 by Rolando.Caloca
DR - Fix for missing hash calculation
Change 3969426 by Rolando.Caloca
DR - vk - Fix warning
Change 3969869 by Krzysztof.Narkowicz
Back out changelist 3946295 - UE-54537 is fixed, so no need for this debug sentinel.
#jira none
Change 3969944 by Rolando.Caloca
DR - Warning fix
Change 3970020 by Rolando.Caloca
DR - Bump after integration
Change 3970052 by Rolando.Caloca
DR - Fix for mobile
Change 3970236 by Daniel.Wright
Causing decal shader to recompile to fix a merge bug
Change 3970270 by Daniel.Wright
Bump shader version from merge
Change 3970339 by Olaf.Piesche
Replace series of locks/unlocks with a single one for curve injection
#tests QAGame
Change 3970390 by Rolando.Caloca
DR - Rename FSceneTextureUniformParameters to FSceneTexturesUniformParameters
- Remove duplicate method for occlusion queries
Change 3970523 by Rolando.Caloca
DR - Fix serialization of shaders
Change 3970533 by Arne.Schober
DR - fix for removing the SpeedTree wind when the scene gets deleted. The original enqueued render command requeues the element onto the render thread although the call already came from the render thread, and the scene can get lost in between.
#jira UE-56322
Change 3971160 by Guillaume.Abadie
Fixes CompositeEditorPrimitive pass and SelectionOutline pass for VR editor to work with TAAU.
Change 3971516 by Guillaume.Abadie
Cherry-pick 3912629: Fixes SSR that was computing vignetting according to PrevScreen, which could let some outside-viewport samples through when rotating the camera.
#jira UE-55353
Change 3971594 by Krzysztof.Narkowicz
Fixed assert inside BindLightMapVertexBuffer. FSplineMeshSceneProxy was calling BindLightMapVertexBuffer for invalid (still not generated) lightmap UV channel after mesh reimport. Simplified assert, as at the moment almost all of the high callsites already clamp lightmap uv channel.
#jira UE-56321
Change 3971622 by Krzysztof.Narkowicz
Fixed crash inside Indirect Lighting Cache. Data (reflection captures and lightmap) generation calls ULevel::GetOrCreateMapBuildData(), which can destroy lightmap data if level has legacy data. Last Lightmap generation step recreates this data, but if user cancels lightmap generation - it won't do that.
#jira UE-56171
Change 3974788 by Rolando.Caloca
DR - Remove GSupportsGenerateMips
Change 3974789 by Rolando.Caloca
DR - Remove bogus function
Change 3974986 by Rolando.Caloca
DR - vk - Tracking fixes
Change 3974989 by Rolando.Caloca
DR - vk - Don't submit dummy barriers
Change 3975075 by Olaf.Piesche
Update for particle curve injection improvement, fixing ES2 problems
#tests QAGame tm-shadermodels, various color curve tests in-editor
Change 3975957 by Uriel.Doyon
Fixed invalid max texture resolution when using the bake material tools.
Change 3978471 by Daniel.Wright
New cvar r.SkylightUpdateEveryFrame
Change 3978779 by Rolando.Caloca
DR - Accessor for texture sizes
Change 3978797 by Rolando.Caloca
DR - Clean up RHI CopyTexture API
Change 3978832 by Rolando.Caloca
DR - vk - Workaround for RenderDoc crashing due to Descriptor Pool reset
Change 3978836 by Rolando.Caloca
DR - vk - Remove generate mips
Change 3979201 by Rolando.Caloca
DR - vk - RHI CopyTexture. Uses general layout for generating mips
Change 3979204 by Rolando.Caloca
DR - Use render passes and CopyTexture to generate mips
Change 3979592 by Rolando.Caloca
DR - Warning fix
Change 3980855 by Krzysztof.Narkowicz
Optimize bounding sphere radius after non-uniform scale by using bounding box extent.
#jira UE-56227
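One way to read this optimization, as a sketch only: scaling the sphere radius by the largest scale component is conservative, and under non-uniform scale the diagonal of the scaled bounding box can be a tighter bound. The types `FVec3` and `FBounds` and the function `ScaleBounds` below are hypothetical, and the sketch assumes a pure axis-aligned scale about a shared box/sphere origin with no rotation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct FVec3 { float X, Y, Z; };

// Hypothetical bounds: a box half-extent and a sphere radius sharing one
// origin, both of which enclose the same geometry.
struct FBounds
{
    FVec3 BoxExtent;
    float SphereRadius;
};

FBounds ScaleBounds(const FBounds& In, const FVec3& Scale)
{
    assert(In.SphereRadius >= 0.0f);

    const float SX = std::fabs(Scale.X);
    const float SY = std::fabs(Scale.Y);
    const float SZ = std::fabs(Scale.Z);

    FBounds Out;
    Out.BoxExtent = { In.BoxExtent.X * SX, In.BoxExtent.Y * SY, In.BoxExtent.Z * SZ };

    // Conservative bound: the sphere grows by the largest scale component.
    const float ConservativeRadius = In.SphereRadius * std::max({ SX, SY, SZ });

    // Alternative bound: the sphere that circumscribes the scaled box.
    const float BoxRadius = std::sqrt(Out.BoxExtent.X * Out.BoxExtent.X +
                                      Out.BoxExtent.Y * Out.BoxExtent.Y +
                                      Out.BoxExtent.Z * Out.BoxExtent.Z);

    // The geometry lies inside both, so the smaller radius is still valid.
    Out.SphereRadius = std::min(ConservativeRadius, BoxRadius);
    return Out;
}
```

For a unit cube extent with a scale of (10, 1, 1), the conservative radius is about 17.3 while the box-derived radius is about 10.1, so taking the minimum tightens the sphere noticeably.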
Change 3981065 by Rolando.Caloca
DR - vk - Fix bad layout
#jira UE-56238
Change 3981346 by Rolando.Caloca
DR - Copy from 3707257
Support for not flushing compute jobs (r.D3D11.UAVFlushNV)
Change 3981347 by Rolando.Caloca
DR - Copy from 3707257
Don't flush between morph dispatches
Change 3981932 by Mark.Satterthwaite
Generate the shader hash and function name when a Metal shader error needs to be reported so that even without shader code we get something to go on.
Change 3982442 by Rolando.Caloca
DR - Fix warning
Change 3982652 by Rolando.Caloca
DR - vk - Signal semaphore cleanup
Change 3983917 by Richard.Wallis
Clone of CL 3974146 converted for mtlpp along with extra mtlpp usage suggestions by Mark Satt:
Fix for black flickering on first paint with weighted material landscape on Mac. When using AsyncCopyFromBufferToTexture in Metal we put the blit operation on the prologue encoder - however after a draw call using that resource the copy operation should happen after on the current encoder, this keeps the correct order of operations.
Added Bool return from various Async renderpass resource requests so caller can decide correct further action. Updated to include the other async functions.
Change 3984409 by Guillaume.Abadie
Attempts to make static analysis happy again.
Change 3984435 by Nick.Bullard
Checking in Performance Test level provided to us by Tor Frick based on UE-44841.
This has been utilized for checking issues against Aftermath performance impact.
The Map includes 2 Level Book marks, most testing has been done against Bookmark 1 view, in fullscreen, in game mode
Change 3985087 by Mark.Satterthwaite
Make sure that the particle scratch buffer is large enough to hold all the data for the curve texture we are rendering to, otherwise a full set of curves will start scribbling memory after 64Kb (the curve texture is 256Kb of data - 512x512x4 as sizeof(RGBAUInt8) == 4). This happens in ElementalDemo.
Change 3985201 by Rolando.Caloca
DR - Fix bad CopyTexture
Change 3985258 by Mark.Satterthwaite
Try and detect orientation changes so that we don't blow-up on iOS due to a huge mismatch between the drawable texture for the display and the scene's depth-stencil target. I can't just fiddle with the depth-stencil texture itself without running the risk of obliterating in-use data and really we shouldn't permit such a mismatch anyway but it is fallout from 3620990.
#jira UE-55756
Change 3986449 by Rolando.Caloca
DR - vk - Update & consolidate Vulkan headers to 1.1.70.1
Consolidate SDK into one
Change 3986571 by Guillaume.Abadie
Makes PVS-Studio happy again in DOF.
Change 3987039 by Yuriy.ODonnell
Initial implementation of tracing profiler to show CPU and multiple GPUs on the same timeline. Currently only supported on DX12 platforms.
Use `TracingProfiler frames=N` console command to trigger a capture of the next N frames. Trace is saved to disk as a JSON file into `Saved/Profiling/Traces` directory.
Trace file uses Google Tracing format and can be visualized in Chrome built-in profiler (chrome://tracing).
`r.GPUStatsChildTimesIncluded=1` CVar makes timing scopes hierarchical.
`TracingProfiler.BufferSize=N` CVar controls the size of the tracing buffer, which may need to be increased for long traces (default is 65k events). Can only be set at startup.
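For orientation, a file that chrome://tracing can load is a JSON object with a `traceEvents` array, where a "complete" event uses `"ph":"X"` with `ts` and `dur` in microseconds. The sketch below writes that shape; it is not the engine's writer. `FTraceEvent` and `ToChromeTraceJson` are hypothetical names, and mapping CPU threads and GPUs onto `pid`/`tid` is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical timing scope captured by a profiler.
struct FTraceEvent
{
    std::string Name;
    std::uint64_t StartUs;    // start timestamp in microseconds
    std::uint64_t DurationUs; // duration in microseconds
    int ProcessId;            // assumption: groups tracks, e.g. CPU vs. a GPU
    int ThreadId;             // assumption: one track per thread or queue
};

// Serialises events as "complete" (ph = "X") entries of the Trace Event Format.
std::string ToChromeTraceJson(const std::vector<FTraceEvent>& Events)
{
    std::ostringstream Out;
    Out << "{\"traceEvents\":[";
    bool bFirst = true;
    for (const FTraceEvent& E : Events)
    {
        // This sketch does no JSON escaping, so names must not contain quotes.
        assert(E.Name.find('"') == std::string::npos);

        if (!bFirst) { Out << ","; }
        bFirst = false;

        Out << "{\"name\":\"" << E.Name << "\""
            << ",\"ph\":\"X\""
            << ",\"ts\":" << E.StartUs
            << ",\"dur\":" << E.DurationUs
            << ",\"pid\":" << E.ProcessId
            << ",\"tid\":" << E.ThreadId
            << "}";
    }
    Out << "]}";
    return Out.str();
}
```

Nested scopes on the same `tid` are drawn hierarchically by the viewer when their time ranges are contained in one another.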
Change 3987074 by Yuriy.ODonnell
Implemented timestamp calibration on DX11. Calibration is only performed when tracing profiler session starts.
Change 3987160 by Yuriy.ODonnell
Added thread naming and ordering to the tracing profiler output
Change 3987331 by Mark.Satterthwaite
Remove the Nvidia hack to retain resource references in command-buffers for UE-46604 as the mtlpp refactor provides stronger resource lifetime guarantees.
#jira UE-46604
Change 3987754 by Mark.Satterthwaite
Fix MetalRHI memory reporting in non-default path.
PR #4568
Change 3988184 by Arciel.Rekman
Linux: Fix editor OpenGL performance (UE-55960).
- GetCurrentThreadId() calls became much more frequent with the OpenGL RHIT refactor.
- We used to only cache that value in monolithic builds, because having per-thread static variables in dynamic libraries is risky due to OS limits.
- This change adds dynamically-managed per-thread cache for non-monolithic builds.
#jira UE-55960
Change 3988394 by Rolando.Caloca
DR - vk - Improve memory mgmt
- Use 256MB pages for Device heap (or 1/8th if less).
- Remove texture allocations not going through resource manager
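The page-size rule in the first bullet can be read as "256 MB, or one eighth of the heap when that is smaller". A minimal sketch of that reading, with `ChooseDeviceHeapPageSize` as a hypothetical helper rather than the actual Vulkan RHI code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Assumed interpretation: cap pages at 256 MiB, but never let a single page
// exceed 1/8th of the device heap on devices with small heaps.
std::uint64_t ChooseDeviceHeapPageSize(std::uint64_t HeapSizeBytes)
{
    assert(HeapSizeBytes > 0);
    return std::min<std::uint64_t>(256 * MiB, HeapSizeBytes / 8);
}
```

Under this reading a 4 GiB heap uses 256 MiB pages, while a 1 GiB heap falls back to 128 MiB pages.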
Change 3988405 by Marcin.Undak
Fix VulkanQuery crash on exit #codereview rolando.caloca #codereview arciel.rekman #rb arciel.rekman
Change 3988567 by Rolando.Caloca
DR - vk - Support for packed global UBs on pci aperture heap
Change 3988668 by Rolando.Caloca
DR - vk - Remove old comments
Change 3988956 by Marcin.Undak
RecordPerformance: added option to skip building/cooking before tests #rb none #codereview arciel.rekman
Change 3989161 by Yuriy.ODonnell
Static analysis error fix
Change 3989196 by Guillaume.Abadie
Fixes a crash in light shaft's TAA pass.
#jira UE-57366
Change 3989207 by Yuriy.ODonnell
Refactored FRealtimeGPUProfilerFrame to avoid splitting profile events when calculating exclusive times of scopes. This allows tracing profiler to retain the hierarchical view of the data, while keeping CSV and GPU Stat system behavior intact.
Change 3989469 by Rolando.Caloca
DR - vk - Fix for bad index; fix for bad transition
Change 3989772 by Yuriy.ODonnell
Implemented timestamp calibration on Vulkan
Change 3990040 by Marcus.Wassmer
Aftermath enabled by default.
Removed unnecessary warning for other vendors
Change 3990064 by Mark.Satterthwaite
Ensure that packed globals are reuploaded when the command-encoder is restarted - don't simply invalidate the existing parameters. This properly handles cases where a single logical render-pass is broken into multiple command-encoders and/or command-buffers - otherwise all shaders must reset all parameters each time. When we move between frames we *do* want to perform a full state reset though as previous frame globals are treated as invalid.
Change 3990080 by Mark.Satterthwaite
Change the way we invalidate the visibility buffer between command-buffers and command-encoders so that on iOS you can reuse the same buffer within the same command-buffer, but not across more than one. The code provides an exception to this rule when running under the MetalRHI validation tools which can break each draw call into its own buffer.
Change 3990084 by Mark.Satterthwaite
Get MetalStatistics compiling again.
Change 3990381 by Arciel.Rekman
Bring back D3D12 in RecordPerformance.
Change 3991113 by Rolando.Caloca
DR - Fix crash on RHI thread on mobile preview
- Check RHI objects are not null in the PSO initializer
Change 3991191 by Ryan.Vance
#jira UE-55952
Reimplemented instanced stereo for forward lighting cull grid after the srv/ub clean up.
Change 3991343 by Rolando.Caloca
DR - Copy from 3911492
UE4 - Disabled parallel mobile base pass by default. This is experimental and not known to be useful on any mobile platform.
Change 3991375 by Mark.Satterthwaite
Proper copyright assignment in the mtlpp debugger header.
Change 3993151 by Daniel.Wright
Fix RTDF resource transition found by Rolando
Change 3993818 by Rolando.Caloca
DR - Missed file
Change 3993923 by Krzysztof.Narkowicz
Fixed crashes inside RemoveSpeedTreeWind() and RemoveSpeedTreeWind_RenderThread().
FStaticMeshComponentRecreateRenderStateContext didn't flush deferred render updates causing stale RenderData to be left:
1. Thumbnail manager called SetStaticMesh(nullptr), which added StaticMeshComponent to deferred render updates.
2. UStaticMesh::Build called FStaticMeshComponentRecreateRenderStateContext and destroyed RenderData, but didn't touch the Thumbnail manager's StaticMeshComponent as it was nullptr.
3. This resulted in a StaticMeshComponent with stale RenderData pointer.
#jira UE-54544
Change 3994033 by Rolando.Caloca
DR - vk - Reworked layers & extensions, as we were not doing it properly
- Remove -vulkanstandardvalidation and -novulkanstandardvalidation as they are not needed anymore
Change 3994275 by Mark.Satterthwaite
Change to linking against mtlpp via AddEngineThirdPartyPrivateStaticDependencies and marking its header with THIRD_PARTY_* macros in the vain hope that might convince the remote compilation code to distribute the module to the remote machine when building MetalRHI.
#jira UE-57507
Change 3994365 by Mark.Satterthwaite
Pilfer some code from the old MetalHeap file to handle calculating texture memory size on older macOS and iOS builds when running with stats or LLM enabled.
#jira UE-57513
Change 3994382 by Rolando.Caloca
DR - vk - Some missing locks during image tracking
Change 3994422 by Rolando.Caloca
DR - vk - Remove bogus shader format
Change 3995530 by Rolando.Caloca
DR - vk - Fix for crash when validation is enabled
Change 3995531 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3995532 by Rolando.Caloca
DR - vk - Added support for r.Vulkan.SaveValidationCache
Change 3995610 by Uriel.Doyon
Texture Streaming Changes and Fixes:
- Using small FOV items (like scopes) now only affects visible primitives (through "r.Streaming.MaxHiddenPrimitiveViewBoost").
- Static components added after the level is registered in the streaming manager are now handled correctly (fixes the low quality on the chests)
- Dynamic components do not need to register to the streaming manager anymore.
- Optimized dynamic component management by removing duplicate entries in the update list.
- Added a pregarbage collect pass to the dynamic component management to optimize GC handling.
- Added a budget reset logic whenever the scene requirements change significantly.
- PIE worlds now have correct visibility information.
- Fixed possible invalid memory access when processing the streaming manager slave views.
- Refactored the incremental level texture data build to prevent new components from being unhandled.
- Removed StreamingManager callbacks for NotifyActorSpawned() and NotifyPrimitiveAttached()
- Added a StreamingManager callback NotifyPrimitiveUpdated(), to be used whenever a primitive streaming state must be updated.
#jira none
Change 3995908 by Arciel.Rekman
Fix compile errors when using new Vulkan queries.
Change 3995990 by Arciel.Rekman
More compile fixes to new Vulkan queries.
- MSVC did not catch this, clang did.
Change 3996101 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3996323 by Mark.Satterthwaite
Use the right include path to export the mtlpp headers.
#jira UE-57507
Change 3996392 by Arciel.Rekman
Vulkan: fix crash on start when using new queries.
- CommandBufferManager was not yet set at that point and the code in queries relied on it.
Change 3996585 by Rolando.Caloca
DR - Slight improvement to GL being black, but just a temporary 'workaround' as it's not correct.
Change 3998806 by Arciel.Rekman
Fix Linux build (UE-57602).
#jira UE-57602
Change 3998866 by Arciel.Rekman
SubwaySequencer: fix old shader platform name.
Change 3998947 by Mark.Satterthwaite
Silence deprecation warnings in CEF on macOS now that we've moved to 10.12 as the minimum.
#jira UE-57577
Change 3998951 by Mark.Satterthwaite
Fix last of the deprecation errors that I am aware of for macOS 10.12.
#jira UE-57581
Change 3998984 by Mark.Satterthwaite
Build mtlpp for iOS 9.0 not 9.3.
#jira UE-57586
Change 3999065 by Rolando.Caloca
DR - vk - Make sure we use version 1.0.0
#jira UE-57521
Change 3999071 by Arne.Schober
DR - [UE-55433, UE-57361] Hack SNORM support in OpenGL by re-interpreting UNORM. Underlying data is always SNORM.
#jira UE-55433, UE-57361
Change 3999494 by Rolando.Caloca
DR - Enable r.UnbindResourcesBetweenDrawsInDX11 in debug
- Clear compute resources when r.UnbindResourcesBetweenDrawsInDX11 is enabled
Change 4000197 by Krzysztof.Narkowicz
Mesh simplifier - normalize TexCoordWeights using min/max TexCoord range. This fixes precision issues for very big TexCoord values and allows optimizing for all TexCoord channels when channels have values of different magnitudes (e.g. non-standard TexCoord data).
#jira UE-54935
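The normalization idea can be illustrated with a small sketch: divide each channel's weight by the range of values seen in that channel, so channels with very different magnitudes contribute comparably. `NormalizedTexCoordWeight` is a hypothetical helper, and the exact formula used by the mesh simplifier is not stated in this change, so treat the division by `max - min` as an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns a weight for one TexCoord channel, scaled by the inverse of the
// channel's value range. Values holds one component of one channel for all
// vertices; BaseWeight is the weight before normalization.
float NormalizedTexCoordWeight(const std::vector<float>& Values, float BaseWeight)
{
    assert(!Values.empty());

    const auto MinMax = std::minmax_element(Values.begin(), Values.end());
    const float Range = *MinMax.second - *MinMax.first;

    // A constant channel has no meaningful range; leave the weight unchanged
    // instead of dividing by (almost) zero.
    const float MinRange = 1e-8f;
    return Range > MinRange ? BaseWeight / Range : BaseWeight;
}
```

With this scheme a channel spanning 0..1000 gets a weight 1000 times smaller than a channel spanning 0..1, which keeps the resulting error terms in a similar numeric range.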
Change 4000305 by Yuriy.ODonnell
Suppress PVS Studio warning V547 (Expression is always true) related to Aftermath
Reported issue to PVS team and to NVIDIA. Confirmed false positive, fix coming in future PVS version (v6.24).
#jira UE-57579
Change 4000853 by Arciel.Rekman
Linux: fix not calling CrashReportClient (UE-57678).
#jira UE-57678
Change 4001504 by Rolando.Caloca
DR - vk - Fix transition
Change 4002460 by Krzysztof.Narkowicz
Toggle for contact shadow length in world space
Exposed contact shadows to Blueprints
#jira none
Change 4002608 by Rolando.Caloca
DR - vk - Fix static analysis
- Fix potential debug image tracking crash
- Comment out unused methods
Change 4002615 by Rolando.Caloca
DR - vk - Allow r.Vulkan.WaitForIdleOnSubmit to be set at startup (e.g. in ConsoleVariables.ini)
Previously, if your map needed to UpdateSkyCaptureContents on startup, an ensure would fail if GWaitForIdleOnSubmit was set.
PrepareForCPURead needs to wait for the command buffer to finish before trying to read the results back, but the wait has already happened when r.Vulkan.WaitForIdleOnSubmit is set. Trying to wait again correctly complains that the command buffer is not in the correct state. So, skip the WaitForCmdBuffer call when r.Vulkan.WaitForIdleOnSubmit is set.
Change 4002640 by Rolando.Caloca
DR - vk - Missing support for CVarDefaultBackBufferPixelFormat
Change 4002919 by Guillaume.Abadie
Implements DOF's temporal upsampling pass for better dynamic resolution stability.
Change 4002984 by Guillaume.Abadie
Integrates Sebastian Aaltonen's ALU optimisations for TAAU.
Change 4003112 by Olaf.Piesche
Fix for TBB stall (resulting in severe hitches and hangs in the editor with stats active); tested multiple scenarios and encountered no hitches.
#tests QAGame PerformanceTest and RenderTest map with various stats on and off
Change 4003159 by Mark.Satterthwaite
Undo parts of changelist 3970553 - the ref-counted pointer approach to returning textures to the pool is not working as expected so we'll remove that. It'll be faster on the CPU without it and everything works thanks to the changes this CL made to the way textures were released.
#jira UE-57538
Change 4003287 by zachary.wilson
Adding reflection capture content to TM-LightingScenarios
Change 4003395 by Arne.Schober
DR - Fix uninitialised value when clicking Go To in the editor
#jira UE-57048
Change 4003425 by Rolando.Caloca
DR - vk - Fix for new occlusion queries
Change 4003530 by Arne.Schober
DR - Disable GPU Benchmark in headless configurations
#jira UE-57673
Change 4003717 by Rolando.Caloca
DR - vk - Fix for depth not store, stencil store
Change 4003719 by Rolando.Caloca
DR - Minor switch to render pass
Change 4003720 by Mark.Satterthwaite
Don't suballocate private memory buffers on Vega and only Vega as there is something wrong with the blits in those cases but I can't capture a GPU trace to find out what right now (the driver is broken) - could be a bug in my code but this works on Polaris and Nvidia so it will need to be filed as a radar for AMD.
Remove the FMetalBufferChunk from FMetalBuffer and simply store a pointer to the owning Heap/Magazine allocator. The FMetalResourceHeap now calls a new Release function to return the buffer to the allocator which will be faster on the CPU.
#jira UE-57659
Change 4003854 by Mark.Satterthwaite
Undo parts of 3990064 and try a different approach to get the uniforms to upload and remain available in the right places. As the original bug has been lost to time we should keep an eye out for missing buffer bindings by running under the Metal validation layer periodically.
#jira UE-57576
Change 4004709 by Rolando.Caloca
DR - Support for D3D 11, 12 & Vulkan for UAVs off Index Buffers
Change 4005149 by Guillaume.Abadie
Adds shader permutation to avoid clamping input buffer UV in DOF's gather pass.
Change 4005284 by Uriel.Doyon
Resaved volume texture assets with proper engine version.
#jira UE-57534
Change 4005286 by Guillaume.Abadie
Reduces constant setup in DOF's gather pass.
Change 4005359 by Rolando.Caloca
DR - vk - Fix annoying warning
Change 4005363 by Rolando.Caloca
DR - Fix android not finding vulkan shaders
Change 4005457 by Rolando.Caloca
DR - vk - Fix swapchain crash
Change 4005473 by Patrick.Kelly
UE-57135: Editor crash if set Reflection Capture Resolution to be 64 and New a Default level
Code by Daniel
Tested by Patrick
Change 4005474 by Rolando.Caloca
DR - vk - Remove glsl code from shaders. Packaged QAGame goes from 176MB to 162MB
Change 4005759 by Krzysztof.Narkowicz
Fixed a bug, where reflection capture build is called, even though we are in mobile preview mode.
#jira UE-57743
Change 4005774 by Mark.Satterthwaite
Update the wave intrinsics to avoid implicit bool->uint conversion that Apple don't like.
#jira UE-57750
Change 4005974 by Mark.Satterthwaite
Don't use cubemap array types on iOS Metal as they aren't available on all devices and we need to maintain backward compatibility for years to come.
#jira UE-57083
Change 4006056 by Mark.Satterthwaite
Remove the use of the PrimitiveType argument from Metal draw calls.
#jira UE-57822
Change 4006139 by Mark.Satterthwaite
- Move the render-pass functions into the MetalRHI implementation for later alteration.
- Implement Index buffer UAVs for Metal - makes them more like vertex-buffers so this is one more step on the road to a unified buffer base-class implementation.
Change 4006215 by Mark.Satterthwaite
Metal's begin & end render/compute pass API implementation will take some time, but for now make it not depend on the parent stub implementation.
Change 4006394 by Mark.Satterthwaite
In lieu of a real instruction count just use the number of lines in the "Main" function of the shader as the instruction count for Metal.
#jira UE-57551
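A line-count heuristic of this kind might look like the sketch below. It is a naive illustration, not the MetalRHI implementation: `EstimateInstructionCount` is a hypothetical name, the function search is a plain substring match, and comments or braces inside string literals are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the lines inside the body of the first function whose name matches
// EntryPoint. Blank lines and lines holding only braces are not counted.
// Returns 0 if the function or its body cannot be found.
std::size_t EstimateInstructionCount(const std::string& Source,
                                     const std::string& EntryPoint = "Main")
{
    assert(!EntryPoint.empty());

    const std::size_t NamePos = Source.find(EntryPoint + "(");
    if (NamePos == std::string::npos) { return 0; }

    const std::size_t Open = Source.find('{', NamePos);
    if (Open == std::string::npos) { return 0; }

    int Depth = 0;
    std::size_t Lines = 0;
    bool bLineHasCode = false;

    for (std::size_t i = Open; i < Source.size(); ++i)
    {
        const char C = Source[i];
        if (C == '{')
        {
            ++Depth;
        }
        else if (C == '}')
        {
            if (--Depth == 0)
            {
                // End of the function body: count a pending line, then stop.
                if (bLineHasCode) { ++Lines; }
                break;
            }
        }
        else if (C == '\n')
        {
            if (bLineHasCode) { ++Lines; }
            bLineHasCode = false;
        }
        else if (C != ' ' && C != '\t' && C != '\r')
        {
            bLineHasCode = true;
        }
    }
    return Lines;
}
```

The result is only a rough proxy for shader cost, which matches the "in lieu of a real instruction count" wording of the change.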
Change 4006493 by Mark.Satterthwaite
MetalRHI can currently support 4-component formats for Buffer UAVs - this might need some thought in the future as the API evolves but we might as well take advantage while we can.
Change 4006495 by Daniel.Wright
Integrate from Refactor branch
* New FMaterialRenderProxy function GetMaterialWithFallback which provides both the FMaterialRenderProxy and FMaterial. Needed when falling back to default material, so that proxy and material resource match.
* Local vertex factory uniform buffer
Change 4006851 by Brian.Karis
Fix for joined charts forming an L to inflate both axes.
Thanks to Jess Kube of The Coalition.
Change 4006852 by Brian.Karis
Fix for hard coded reflection capture cube map size. Should fix light static light aliasing in captures
Change 4006918 by Brian.Karis
New ByteBuffer functionality. Memcpy and scatter upload. Can implement GPU side TArray reflection.
Not yet used by checked in code. WIP optimization.
Change 4007246 by Guillaume.Abadie
Creates lower quality permutation for DOF's gathering pass, without Coc based weighting of the samples, and lower number of gathering ring for fast accumulator.
Change 4007291 by Guillaume.Abadie
Exposes more DOF scalability settings.
Change 4007328 by Guillaume.Abadie
Optimises DOF's half res only setup pass using gather4
Change 4007627 by Richard.Wallis
Fix for when Magic Mouse cannot zoom in World Composition editor. Missing default SNodePanel::OnMouseMove behaviour. Tested using a classic 2xbutton + wheel mouse and a Mac MagicMouse.
#jira UE-57030
Change 4007682 by Richard.Wallis
No video when playing HLS streaming video on Mac. Two issues: FPS was zero, making the duration for the video sample buffer nonsense, and Video Track dimensions were going to zero on the AVAsset once fully initialized when playing HLS streams. Now cache relevant details and handle zero frame rate.
Notes:
- Caching the frame rate is not as important as we could look it up each time and fix for zero - ignoring that at the moment.
- Assume we DO NOT want the FrameSize to be the last fetched video frame size from the AvfMediaVideoSampler as I think that is the video quality for streaming video and not the media frame size.
- Renamed a variable in the AvfMediaVideoSample - was called FrameRate but it was the FrameDuration by that point.
#jira UE-56734
Change 4007731 by Rolando.Caloca
DR - Disable byte buffers on non-hlsl based platforms
#jira UE-57851
Change 4007741 by Rolando.Caloca
DR - Disable byte buffers on hlslcc platforms
Change 4007782 by Mark.Satterthwaite
Force Metal shaders, including the stdlib, to recompile.
Change 4007918 by Rolando.Caloca
DR - vk - Some static asserts
Change 4008404 by Arciel.Rekman
Do not crash on incompatible Vulkan drivers (UE-57521).
#jira UE-57521
Change 4008442 by Daniel.Wright
Better comments on ERHIFeatureLevel expectations
Change 4008494 by Arne.Schober
DR - moved bDeletedThroughDeferredCleanup before begincleanup to catch cases where the reference is added twice to the array. Also removed finishcleanup as all they ever did was delete the pointer anyway, and it should be added if such functionality is ever required from outside of the regular destructor.
#jira UE-57754
Change 4008730 by Mark.Satterthwaite
After the most recent changes to handling uniform buffer dirty bits in MetalRHI we should guard against attempts to set an unbound uniform buffer.
#jira UE-57870
Change 4008949 by Brian.Karis
Fix compile warning
Change 4008951 by Brian.Karis
Added LTC LUT textures
Change 4009326 by Guillaume.Abadie
Compiles out DOF's gathering bokeh simulation on platforms other than desktop.
Change 4009380 by Krzysztof.Narkowicz
Moved area light code before the contact shadows, so contact shadows use representative light's direction.
Merged all contact shadows shader code.
Contact shadows keep constant screen space length independent of FoV settings.
Contact shadows for translucents.
Contact shadows for eye.
Change 4009555 by Guillaume.Abadie
Splits DOFCocTile.usf in two.
Change 4009999 by Yuriy.ODonnell
MallocStomp can now be enabled on certain platforms using '-stompmalloc' command line argument.
Previously it was necessary to modify MallocStomp.h and re-compile the engine.
Currently supported platforms: Win64, Mac, Linux.
Replaced hard-coded page size with FPlatformMemory::GetConstants().PageSize.
Change 4010288 by Rolando.Caloca
DR - vk - Fix for vertex streams
Change 4010289 by Krzysztof.Narkowicz
D3D12 - fixed depth bounds bug, where depth bounds wasn't properly set to [0;1] after disabling.
#jira UE-57510
Change 4010297 by Rolando.Caloca
DR - vk - Remove some functions for android
Change 4010315 by Rolando.Caloca
DR - vk - Remove create info macro
Change 4010451 by Rolando.Caloca
DR - vk - Reuse samplers
- Infiltrator goes from 5759 to 24 samplers!
Change 4010627 by Rolando.Caloca
DR - vk - Fix missing values for tracking swapchain validation
Change 4011924 by Guillaume.Abadie
Implements tile based early return optimisation on DOF's postfiltering method.
Change 4011941 by Guillaume.Abadie
Shaves some ALU in DOF's accumulator for LowQuality permutation.
Change 4012093 by Yuriy.ODonnell
Disable MallocStompOverrunTest() in static analysis config, as it intentionally performs an out-of-bounds access.
Change 4012195 by Rolando.Caloca
DR - vk - Fix for mobile backbuffer layout
Change 4012202 by Rolando.Caloca
DR - vk - Don't use staging buffers on UMA
Change 4012467 by Rolando.Caloca
DR - Remove redundant check
Change 4012486 by Rolando.Caloca
DR - Fix missing transition
Change 4012518 by Guillaume.Abadie
Implements fast shader permutation for DOF's TAA pass.
Change 4013084 by Arciel.Rekman
Fix for Linux clock discrepancy.
- Causing at least one precision issue, possibly more.
(Edigrating 4003273, 4012462 from //UE4/Dev-Editor/... to //UE4/Dev-Rendering/...)
Change 4013266 by Uriel.Doyon
Fixed crash when setting SceneDepthTextureNonMS and not having valid depth buffers in the SceneContext.
Change 4013626 by Uriel.Doyon
Fixed crash in the lighting build when creating a blueprint of the ALight and placing a light component in it.
#jira UE-51672
Change 4013805 by Rolando.Caloca
DR - Fix more missing transitions
Change 4014128 by Arne.Schober
DR - Do not create LocalVFUniformBuffer when running without MVF
#jira UE-57929
Change 4014193 by Uriel.Doyon
Editing component transforms now invalidate the component's lighting cache.
#jira UE-48134
Change 4014282 by Rolando.Caloca
DR - vk - Remove extra validation during dump
Change 4014584 by Uriel.Doyon
Duplicated static meshes now generate a new GUID to prevent possible issues with lightmass.
#jira UE-49064
Change 4014604 by Uriel.Doyon
UStaticMesh postduplicate now only generates a new GUID if !bDuplicateForPIE.
Change 4015460 by Guillaume.Abadie
Composes separate translucency within DOF's recombine pass.
Change 4015571 by Guillaume.Abadie
Refactors tonemapper to use the global shader permutation API, which adds a permutation for HDR output devices rather than dynamic branching that some shader compilers do not optimize well.
Change 4015984 by Krzysztof.Narkowicz
Fixed crash inside DFAO resource allocation, when DFAO viewport has zero area.
#jira UE-58000
Change 4016056 by Mark.Satterthwaite
Fix Mac Metal shader compilation of texture cube arrays.
Change 4016062 by Richard.Wallis
Convert things like Space, Delete, F6 etc to unicode so they display correctly on the Mac menu rather than first letter of word. Added the default Mac commands to the GenericCommands so we get a Chord overwrite message and stop things like cmd+ q / w / h from getting bound.
#jira UE-46999
Change 4016109 by Mark.Satterthwaite
One unified Metal buffer implementation - will make further changes a heck of a lot easier.
Change 4016221 by Patrick.Kelly
UE-57617: Ensure changing viewmode to ShaderComplexity while in -game
Change 4016238 by Guillaume.Abadie
Makes clang happy again in Tonemapper.
Change 4016309 by Mark.Satterthwaite
More *_RenderThread implementations for MetalRHI.
Change 4016414 by Mark.Satterthwaite
And MetalRHI version of CreateStructuredBuffer_RenderThread...
Change 4016498 by Mark.Satterthwaite
Don't hold on to the uniform buffers bound to the hull shader when switching to a tessellated draw call as they'll have the wrong buffer layout.
#jira UE-57930
Change 4017394 by Juan.Canada
OpenGL: Fixed shading artifacts due to incorrect UNORM/SNORM conversions in skin/skincache/computetangent shaders.
#jira UE-57691
Change 4017522 by Rolando.Caloca
DR - vk - Remove unused code path (old mip generation detection)
Change 4017539 by Rolando.Caloca
DR - vk - Fix for sky lighting mips showing green on AMD
Change 4017542 by Arciel.Rekman
Moved appCountTrailingZeros to a non-SSE header (fixes ARM64 build).
- Arguably WITH_SLI shouldn't apply to Linux on ARM but the fact that the function wasn't available is bad on its own.
Change 4017827 by Guillaume.Abadie
Optimises DOF's scattering cost by a third.
Change 4017835 by Rolando.Caloca
DR - Only allow a render pass to generate mips for one color render target
Change 4017889 by Mark.Satterthwaite
Cache all the Metal state objects to avoid hitting the API unnecessarily.
Change 4018251 by Mark.Satterthwaite
Fix broken rendering on Metal that tracked back to the innocuous looking changes in CL #4006495 (no blame attached - these changes are entirely reasonable) and cause various bugs in QAGame's TM-DistanceFields, ElementalDemo and probably more. Doesn't fix broken SpeedTree rendering :(.
MetalRHI was allowing uniform buffers to blow away linear texture buffers when the constant buffer has been elided due to dead-code elimination. This problem can manifest without linear textures if the uniform buffer contains both constant data and a resource-table but the shader doesn't use any of the constant data. That's because Metal doesn't separate constant buffers from any other kind of buffer unlike D3D which separates all the slots out - and Metal doesn't provide enough buffers to emulate the D3D arrangement. So far this has only manifested in the MVF + Linear Texture case but a more robust solution will be necessary long term.
Change 4018514 by Guillaume.Abadie
Implements r.DOF.Scatter.MinCocRadius.
Change 4018553 by Guillaume.Abadie
Implements r.DOF.Scatter.MaxSpriteRatio to control the budget upper bound of DOF's scattering.
Change 4020369 by Yuriy.ODonnell
Disable MallocStompOverrunTest in all static analysis configs (using USING_CODE_ANALYSIS macro)
Previously was only disabled for PVS-Studio.
Change 4020620 by Arciel.Rekman
Fix XboxOne CIS (fallout of appCountTrailingZeros move).
Change 4020949 by Guillaume.Abadie
Configures DOF in scalability settings.
Change 4021593 by Rolando.Caloca
DR - vk - Support for Aftermath style api on AMD
Change 4021740 by Rolando.Caloca
DR - vk - Change log output
Change 4022008 by Uriel.Doyon
Fixed renderthread stalls when streaming texture mips on low end systems.
Change 4022135 by Rolando.Caloca
DR - vk - Fix last mip's layout during mip chain creation
Change 4022607 by Jian.Ru
Speculative fix for a bug where an invalid vertex buffer is dereferenced
#jira UE-56229
Change 4022890 by Rolando.Caloca
DR - Fix reference count not getting released
Change 4023540 by Mark.Satterthwaite
Avoid some pointless retain/release calls on Metal Encoders.
Change 4023796 by Marcus.Wassmer
Tell users they are over the maximum size when allocating very large rendertargets.
Change 4025337 by Yuriy.ODonnell
Improved use-after-free detection mechanism and physical memory usage of MallocStomp on Windows.
MallocStomp on Windows will now reserve virtual address space for every allocation and then commit physical pages only to the valid usable part.
Physical pages will be unmapped on Free, but virtual address space will not be released and therefore will never be re-used.
Virtual address space is allocated from the OS in blocks of 1GB and then linearly sub-allocated.
This reduces VA space usage, as VirtualAlloc returns blocks on 64KB granularity even if we just need 4KB. As a small bonus, this also reduces number of syscalls per allocation.
This dramatically increases accuracy of use-after-free detection, but consumes significant amount of memory for the OS page table.
Virtual memory limit for a process on Win10 is 128 TB, which means we can afford to keep virtual memory reserved for a long time.
Running Infiltrator demo consumes ~700MB of virtual address space per second.
Additionally, committing physical pages only for the usable part of the entire virtual block reduces physical memory usage by ~30% compared to old behavior,
which allocated and committed the entire block of pages via BinnedAllocFromOS and then marked the border page as non-accessible.
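The reservation scheme described in this change can be illustrated with a small model. This is a sketch only, not the MallocStomp code: it tracks addresses as plain integers, makes no OS calls, and all names (FStompRange, FLinearAddressSpaceModel) are hypothetical. It also assumes a single trailing guard page per allocation, which the description does not spell out for the new behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of one allocation in the model.
struct FStompRange
{
    std::uint64_t Base;           // start of the reserved address range
    std::uint64_t CommittedBytes; // pages that would be backed by physical memory
    std::uint64_t ReservedBytes;  // committed pages plus one uncommitted guard page (assumption)
};

// Models linear sub-allocation of address space from large blocks.
// Ranges are never handed out twice, mirroring "never re-used" above.
class FLinearAddressSpaceModel
{
public:
    FLinearAddressSpaceModel(std::uint64_t InPageSize, std::uint64_t InBlockSize)
        : PageSize(InPageSize), BlockSize(InBlockSize)
    {
        assert(PageSize > 0 && BlockSize >= 2 * PageSize);
    }

    FStompRange Allocate(std::uint64_t Size)
    {
        // Round the usable part up to whole pages; only these would be committed.
        const std::uint64_t UsablePages = (Size + PageSize - 1) / PageSize;
        const std::uint64_t Committed = UsablePages * PageSize;
        const std::uint64_t Reserved = Committed + PageSize;
        assert(Reserved <= BlockSize);

        // Start a fresh block when the current one cannot hold the range.
        if (BlockCount == 0 || Cursor + Reserved > BlockEnd)
        {
            Cursor = BlockCount * BlockSize; // model only: blocks laid out back to back
            BlockEnd = Cursor + BlockSize;
            ++BlockCount;
        }

        const FStompRange Range{Cursor, Committed, Reserved};
        Cursor += Reserved; // bump pointer; freed ranges are never recycled
        return Range;
    }

    std::uint64_t GetBlockCount() const { return BlockCount; }

private:
    std::uint64_t PageSize;
    std::uint64_t BlockSize;
    std::uint64_t Cursor = 0;
    std::uint64_t BlockEnd = 0;
    std::uint64_t BlockCount = 0;
};
```

With a 4KB page and a 1GB block, a 100-byte request in this model commits one page and reserves two, so successive ranges sit 8KB apart instead of landing on the 64KB boundaries that separate VirtualAlloc calls would produce.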
Change 4026047 by Rolando.Caloca
DR - Fix test/shipping
#jira UE-58148
Change 4026150 by Krzysztof.Narkowicz
Force proper ordering of buffer visualization materials - after tonemapping (so exposure doesn't influence it) and before editor stuff like icons.
#jira UE-57992
Change 4026226 by Rolando.Caloca
DR - Fix static analysis
#jira UE-58150
Change 4026354 by Jian.Ru
Debug check trying to catch a crash. Only enabled in editor build
#jira UE-50111
Change 4026655 by Rolando.Caloca
DR - Fix for static analysis
#jira UE-58149
Change 4026763 by Rolando.Caloca
DR - Remove references to defunct CCT to avoid confusing licensees
Change 4027167 by Uriel.Doyon
Fixed possible out of bound buffer access when serializing with FDuplicateDataWriter.
#jira UE-56509
Change 4027850 by Jian.Ru
Prevent log spam
#jira UE-50111
Change 4029546 by Rolando.Caloca
DR - Compile fixes
Change 4029624 by Yuriy.ODonnell
Addressed static analysis errors in MallocStomp
- VirtualAlloc return value is now explicitly checked.
- C6250 is suppressed, as VirtualFree does not release address space by design.
Change 4030225 by Yuriy.ODonnell
Static analysis warning fix: make sure declaration of Sleep() is consistent between Windows headers and TBB
The complexity with this particular case is that the warning is generated in synchapi.h, which is included by some Windows headers.
If a module includes TBB and then Windows platform headers, static analyzer will report this warning.
Suppressing it would require wrapping all instances of Windows header includes in third-party macros.
Current pragmatic solution is to modify the Sleep() declaration in TBB header to be consistent with Windows and to report the issue to Intel for a permanent fix.
Change 4030440 by Rolando.Caloca
DR - Fix crash on mobile
#jira UE-58222
Change 4030570 by Daniel.Wright
Allow null SRV's in uniform buffers for feature levels that don't support SRV's in shaders
Change 4030618 by Arne.Schober
DR - missing tangent/normal sign conversion after integration from main
#jira UE-58224
Change 4031588 by Rolando.Caloca
DR - vk - Fix compile error when missing vkCmdWriteBufferMarkerAMD
Change 4032145 by Mark.Satterthwaite
Fix UE-58268 by only emitting the base_instance/base_vertex variables required to fix-up the instance/vertex ID values to match D3D when the Metal version is 1.1 or higher, earlier versions don't support these features.
#jira UE-58268
Change 4032209 by Rolando.Caloca
DR - Fix crash on EngineTest: Mesh Batch's UserIndex is not a union anymore
Change 4033178 by Guillaume.Abadie
Fixes FXAA sampling outside viewports, that was causing black outline on bottom and right edge of the screen when ViewSize != BufferSize, problematic for some screenshot automated test.
#jira UE-58151
Change 4034489 by Daniel.Wright
Fixed UStaticMeshComponent modifying its UStaticMesh when undoing a change. This caused a crash when other static mesh components using the same mesh asset were rendered, since their rendering state was not recreated. A component should not modify its asset during PostEditUndo.
* This behavior has been present for a long time but was previously hidden because only the vertex factory of the mesh asset is cached in static draw lists, not any of its rendering resources (eg vertex declaration).
Change 4035157 by Uriel.Doyon
Fixed deadlock in the streaming code when running with -onethread.
#jira UE-58299
Change 4035198 by Rolando.Caloca
DR - vk - Fix issue when an older SDK was installed, UBT would pick it (should pick the newer of ThirdParty\Vulkan or installed SDK).
#jira UE-58267
Change 4035730 by Arne.Schober
DR - Fix missing Fog parameters during LightScattering Injection
#jira UE-57608
Change 4035843 by Daniel.Wright
Reimplemented support for EyeAdaptation node in opaque materials
Change 4036837 by Marcus.Wassmer
Replace some of the screenshots to match new un-tonemapped buffer visualization
Change 4036980 by Rolando.Caloca
DR - vk - Fix deadlock contention during mem allocation on Linux
Change 4037225 by Guillaume.Abadie
Fixes jittering selection outline.
#jira UE-58350
Change 4038056 by Marcus.Wassmer
roll back changelist 4026150. breaks a bunch of automated tests by cutting off half the image.
Change can go back in later with that part fixed also
Change 4038296 by Jian.Ru
Static analysis fix
#jira UE-58377
Change 4038402 by Ben.Marsh
Suppress IncludeTool warnings caused by CL 3998947.
Change 4038514 by Arne.Schober
DR - Fix case with MVF where instance offset is not supported by the API (in this case only foliage OpenGL and TvOS). Usually the buffers are offset instead, but with MVF we do not use offset buffers, therefore the offset needs to be passed into the shader although we are drawing with an offset of 0.
#jira UE-57652
Change 4038747 by Marcus.Wassmer
Back out changelist 3853645, causing us to lose shadows in the shaderhair test
Change 4040138 by Rolando.Caloca
DR - Fix compile warning
Change 4041614 by Rolando.Caloca
DR - vk - Fix for Oculus module
#jira UE-58267
Change 3810277 by Daniel.Wright
Ray Traced Distance Field shadows use a two pass tile culling algorithm with no tile max - fixes flickering from tile overflow in dense areas or with a low sun angle. Costs .2ms on PS4.
The distance field scene buffers now use float4 on PS4 and Xbox, saves .1ms on PS4.
Change 3817029 by Uriel.Doyon
Added UVolumeTexture, which use 3D textures. Compressed formats are supported on DX11, DX12, PS4 and XB1.
Projects targeting OpenGL don't have access to compressed formats (as the implementation has texture tiling issues).
Add "r.AllowVolumeTextureAssetCreation" set as 0 by default, which controls whether volume texture can be sampled in materials and whether they can be created from 2D texture assets.
Platforms not supporting BC7 will now fall back to RGBA8 instead of DXT to preserve quality, in an attempt to increase usage of BC7.
#jira UE-32263
Change 3819960 by Michael.Lentine
Expose UEPhysics Clothing Parameters through UI.
Change 3823401 by Rolando.Caloca
DR - Add NumQueriesInBatch to RHIBeginOcclusionQueryBatch
Change 3844805 by Arne.Schober
DR - Increased Intermediate normal of Umodel and Skelmesh from 8bit Unorm Compressed to float. A resave/rebuild/reimport of the meshes is recommended to recover some lost precision.
Fixed an issue with compressed (packed) normals on the GPU which were off by one integer representation. Also switched from UNORM to SNORM to get a discrete zero representation and removed some mads from all the VertexShaders.
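The "discrete zero representation" point can be checked with the standard 8-bit normalized-integer conventions. The sketch below is not the engine's packing code; the function names are hypothetical and the encode/decode formulas are the usual UNORM-with-bias and SNORM mappings, assumed here for illustration. The UNORM decode needs the extra multiply-add (`* 2 - 1`), which may be the kind of mad the change refers to.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// UNORM8 with bias: maps [-1, 1] onto [0, 255]. Zero lands between two codes.
inline std::uint8_t EncodeUnorm8(float X)
{
    assert(X >= -1.0f && X <= 1.0f);
    return static_cast<std::uint8_t>(std::lround((X * 0.5f + 0.5f) * 255.0f));
}

inline float DecodeUnorm8(std::uint8_t V)
{
    // Requires a multiply-add after normalization to restore the sign.
    return (static_cast<float>(V) / 255.0f) * 2.0f - 1.0f;
}

// SNORM8: maps [-1, 1] onto [-127, 127]. Zero has its own code.
inline std::int8_t EncodeSnorm8(float X)
{
    assert(X >= -1.0f && X <= 1.0f);
    return static_cast<std::int8_t>(std::lround(X * 127.0f));
}

inline float DecodeSnorm8(std::int8_t V)
{
    // Clamp so that the code -128 also decodes to -1.
    return std::fmax(static_cast<float>(V) / 127.0f, -1.0f);
}
```

Round-tripping 0.0 through the UNORM pair yields a small non-zero value, while the SNORM pair returns exactly 0.0, and both endpoints -1 and 1 survive the SNORM round trip unchanged.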
Change 3847283 by Marcus.Wassmer
Extra fixes from Uriel
Change 3876607 by Rolando.Caloca
DR - Use render passes when running occlusion queries
- Removes the RHI(Begin|End)OcclusionQueryBatch API
Change 3903799 by Daniel.Wright
[Integrate] Pass Uniform Buffers
* All pass-constant shader inputs should go into the appropriate pass uniform buffer, instead of being set per-draw
* Moved many per-draw base pass parameters over to the Base Pass Uniform Buffer
* Opaque and Translucent base pass shaders have different uniform buffers, which allows compile errors when accessing an invalid resource (eg GBuffer in Opaque), instead of silently falling back to GBlackTexture
Uniform buffers can now contain nested structs with UNIFORM_MEMBER_STRUCT()
* This allows composing a uniform buffer at a particular update frequency out of many features, with encapsulation of each feature's parameters in a struct.
* Eg deferred fog uses FFogUniformParameters, but so does translucency in the base pass, where FFogUniformParameters is reused nested inside the base pass uniform buffer.
* Resources can now be located anywhere in the uniform buffer. Padding is inserted to the cbuffer representation to keep memory layouts matching. In the future the cbuffer could be compacted.
* RemoveUniformBuffersFromSource() which works around HLSLCC lack of struct initializers now handles nested structs
Change 3917500 by Rolando.Caloca
DR - Change depth bounds so only the enable bit is in the PSO, allow min/max to be dynamically modified
Change 3964907 by Guillaume.Abadie
Implements RectList topology support in RHI.
Change 3979171 by Mark.Satterthwaite
Copying //Tasks/UE4/Dev-UERNDR-354-mtlpp to Dev-Rendering (//UE4/Dev-Rendering):
Rewrites MetalRHI in terms of mtlpp, which is a C++ wrapper library built around Metal's Objective-C API that attempts to reduce overheads and eliminate resource lifetime errors.
Regarding mtlpp:
- The mtlpp library uses C++ constructor/destructor and smart-pointer style management of Objective-C retain/release calls to prevent over- and under-release problems.
- To reduce Objective-C overheads the mtlpp library caches the internal C-function that implements the Objective-C selectors for the most commonly used Metal protocol types and calls the function directly - this avoids objc_msgSend which does this look-up dynamically and thus improves CPU performance slightly.
- Another advantage is that mtlpp provides infrastructure to extend the Metal API slightly to help improve MetalRHI - the two important aspects are mtlpp::CommandBufferFence which provides a consistent CPU<->GPU synchronisation primitive and sub-buffer allocations from mtlpp::Buffer which allow for far superior memory management.
- Validation functionality is also provided by mtlpp to detect CPU vs. GPU data races and resource lifetime validation - this is expensive and is thus optional and compiled out from Shipping binaries that should be used when performance is most critical. The validation only works between resource modification and *submitted* command-buffers - anything that is being actively encoded on the CPU is ignored and it remains the responsibility of the application to validate the order of operations when encoding.
Apple Platform:
- LLM support which tracks Objective-C objects is enabled only on macOS - we don't have the necessary libraries to intercept and override the internal system calls on iOS.
MetalRHI:
- All the types are switched over, (mostly) insulating the external API from the horror of Metal and Objective-C.
- Buffers are now managed quite differently, small buffers are allocated from a magazine allocator that allocates in fixed blocks from a larger parent buffer, intermediate sized buffers are allocated from a simple heap allocator that wraps a larger buffer and anything of reasonable size (>2Mb) will use the pooled allocator. This *radically* reduces the number of buffer resources, by as much as a factor of 10, because they are now sub-allocated without the need to use MTLHeap or MTLFence so they are performance equivalent to the existing implementation on the GPU and much faster on the CPU. Total memory use is approximately the same.
- Vertex & index buffer management has been updated to reflect changes in the management and to avoid reallocating buffers which provide a Linear Texture (for SRVs) unless strictly necessary. This ensures that even in cases where a dynamic buffer is updated multiple times in a frame it will still work acceptably well.
- The Metal ring-buffer implementation is completely different again, this time it can use Managed memory on macOS which allows for much better performance on eGPUs which will be more and more important for Mac.
- Everything that needs to wait on a command-buffer fence (rather than a command-buffer itself) now uses mtlpp::CommandBufferFence, which prevents race conditions between the different command-buffer handlers (which sometimes execute out of order).
- LLM tracking should now report the same data as the MetalRHI stats group for buffer & texture allocations - there is no segmentation for Vertex/index/Structured/Uniform allocations in Metal so these numbers are going to be wrong and will need to be rethought.
- What will be unseen are the number of small but important resource usage fixes that avoid stale resources from being bound to the device after the point at which they become invalid. This should eliminate a class of errors where the GPU uses a resource pointer that is modified by the CPU and was necessary to satisfy the new mtlpp validation code.
Other:
- Remove the Metal focused workarounds from the ClothBuffer resource binding and related vertex-buffer SRV - these were put in when MetalRHI/MetalShaderFormat couldn't handle float->uint conversions correctly and they should now.
- Fix a validation error caused by trying to render a 0-sized scissor rect which is invalid in Metal and simply pointless elsewhere.
- Consistency of disabling the Manual Vertex Fetch behaviour in shaders.
#jira UERNDR-354
Change 3979312 by Rolando.Caloca
DR - Remove bogus bKeepOriginalSurface parameter in CopyToResolveTarget
Change 4005122 by Rolando.Caloca
DR - Support for PS4 Index Buffer UAVs
Change 4016298 by Guillaume.Abadie
Fixes DOF hybrid scattering on platforms that support RectList topology.
Change 4018575 by Guillaume.Abadie
Optimises DOF's reduce pass when doing scattering compilation.
Change 4020317 by Guillaume.Abadie
Implements WaveBroadcastIntrinsics.ush.
[CL 4042226 by Marcus Wassmer in Main branch]
2018-05-01 10:36:33 -04:00
ViewUniformShaderParameters . GlobalVolumeCenterAndExtent [ Index ] = GlobalDistanceFieldInfo . ParameterData . CenterAndExtent [ Index ] ;
ViewUniformShaderParameters . GlobalVolumeWorldToUVAddAndMul [ Index ] = GlobalDistanceFieldInfo . ParameterData . WorldToUVAddAndMul [ Index ] ;
2020-09-15 11:03:59 -04:00
ViewUniformShaderParameters . GlobalDistanceFieldMipWorldToUVScale [ Index ] = GlobalDistanceFieldInfo . ParameterData . MipWorldToUVScale [ Index ] ;
ViewUniformShaderParameters . GlobalDistanceFieldMipWorldToUVBias [ Index ] = GlobalDistanceFieldInfo . ParameterData . MipWorldToUVBias [ Index ] ;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
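The three r.sss.checkerboard values described above could be resolved along these lines. This is a sketch under assumptions: the function name and parameters are hypothetical, and the automatic mode is read as "use full resolution whenever the scene color target is 64-bit or wider with separate alpha, otherwise fall back to checkerboard", which is one interpretation of the description rather than the engine's actual logic.

```cpp
#include <cassert>

// Hypothetical helper: decides whether the skin postprocess should use
// checkerboard rendering for a given r.sss.checkerboard value.
//   0 = off (full resolution), 1 = checkerboard, 2 = automatic.
inline bool ShouldUseCheckerboardSSS(int CVarValue, int SceneColorBitsPerPixel, bool bHasSeparateAlpha)
{
    assert(CVarValue >= 0 && CVarValue <= 2);

    switch (CVarValue)
    {
    case 0:
        return false;
    case 1:
        return true;
    default:
        // Assumed automatic rule: full resolution needs a 64-bit or wider
        // render target with separate alpha; anything less uses checkerboard.
        return !(SceneColorBitsPerPixel >= 64 && bHasSeparateAlpha);
    }
}
```

Under this reading, a 32-bit scene color target in automatic mode keeps the checkerboard path, while a 64-bit target with separate alpha switches to full resolution.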
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
}
2020-09-15 11:03:59 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldMipFactor = GlobalDistanceFieldInfo.ParameterData.MipFactor;
ViewUniformShaderParameters.GlobalDistanceFieldMipTransition = GlobalDistanceFieldInfo.ParameterData.MipTransition;
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldClipmapSizeInPages = GlobalDistanceFieldInfo.ParameterData.ClipmapSizeInPages;
2022-02-02 07:59:31 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldInvPageAtlasSize = (FVector3f)GlobalDistanceFieldInfo.ParameterData.InvPageAtlasSize;
2022-03-01 21:07:45 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldInvCoverageAtlasSize = (FVector3f)GlobalDistanceFieldInfo.ParameterData.InvCoverageAtlasSize;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 4041614)
#lockdown Nick.Penwarden
============================
MAJOR FEATURES & CHANGES
============================
Change 3774677 by Arne.Schober
DR - Deprecated SetLocal from the RHICmdlist
Fixed some unnecessary PSO collisions.
Change 3809579 by Chris.Bunner
Back out changelist 3774677.
#jira UE-53483
Change 3810363 by Mark.Satterthwaite
More random fixes to mtlpp: most important is the extension to Buffer that allows creation of sub-buffers that are merely views onto a sub-range of the parent. These sub-buffers are valid to use throughout the mtlpp API with two exceptions: they may not be used for visibilityResultsBuffers and Set*BufferOffset functions cannot take this offset into account (as the encoder does not hold onto the buffers and I don't want it to). In the case of Set*BufferOffset the caller has to know what is going on and in the case of visibilityResultsBuffers it'll just assert as it isn't sensible.
This makes it *much* easier to do things like sub-buffer allocation, though the caller must be aware of the alignment restrictions of their intended usage as they are not possible to enforce. For example, a call to SetVertexBuffer requires that the offset alignment match the alignment of the data-type in the shader for "device" resources, while for "constant" data it must be max(4, sizeof(datatype)) on iOS and 256 on macOS. This should allow for much more tightly packed sub-allocations than earlier approaches, though older drivers (e.g. Mac OS X 10.11) enforce only the coarser "constant" data restriction everywhere.
Change 3810407 by Marcus.Wassmer
PR #4322: ShadowSetup Bug Fix: Only stencil mask drawn meshes (Contributed by DSDambuster)
Change 3810676 by Guillaume.Abadie
Makes r.Test.SecondaryUpscaleOverride work with any arbitrary pixel size.
Change 3810696 by Guillaume.Abadie
Adds support for #include "../MyFile.ush" in the shader compiler.
Change 3810698 by Guillaume.Abadie
Implements enum class based shader permutation dimension.
Change 3810699 by Guillaume.Abadie
Implements Diaphragm DOF ground work.
Change 3811536 by Guillaume.Abadie
Pulls the trigger on CircleDOF's setup pass for DiaphragmDOF.
Change 3811958 by Mark.Satterthwaite
More fixes for mtlpp.
Change 3811964 by Mark.Satterthwaite
Only views onto a mtlpp::Buffer should return a valid parent-buffer.
Change 3812604 by Guillaume.Abadie
Changes Diaphragm DOF's source file layout.
Change 3812827 by Mark.Satterthwaite
More missing/broken functionality in mtlpp fixed and fixed obvious leaks.
Change 3812920 by Guillaume.Abadie
Adds support for per mip level UAV in FSceneRenderTarget.
Change 3812926 by Mark.Satterthwaite
Change the way we handle mtlpp resource construction to avoid leaks.
Change 3812960 by Rolando.Caloca
DR - vk - Disable DFGI
Change 3812968 by Rolando.Caloca
DR - Linker fix
Change 3813318 by Mark.Satterthwaite
Fix linear texture allocation from a buffer sub-view.
Change 3813326 by Mark.Satterthwaite
Fix another Metal mtlpp sub-buffer allocation failure.
Change 3813328 by Guillaume.Abadie
Removes global samplers in TAA for GL4, Vulkan and Switch.
Change 3813937 by Rolando.Caloca
DR - Fix logs not getting dumped when r.DumpSCWQueuedJobs is on
Change 3813947 by Rolando.Caloca
DR - noshaderworker should override r.XGEShaderCompile
Change 3817017 by Uriel.Doyon
Fixed texture editor black screen
#jira UE-53653
Change 3818568 by Rolando.Caloca
DR - Fix log when shader jobs crash
- Move log10 to common
- Added COMPILER_VULKAN define
Change 3818603 by Uriel.Doyon
Fix to static analysis warning
Change 3818623 by Rolando.Caloca
DR - Workaround hlslcc loop unrolling bug
Change 3819070 by Uriel.Doyon
Fix to stat duplication.
Change 3819105 by Uriel.Doyon
Refactored volume sample shader to avoid using texture dimension.
Change 3819136 by Rolando.Caloca
DR - vk - Per platform files (empty)
Change 3819180 by Rolando.Caloca
DR - vk - Move defines out of config into per platform
Change 3819247 by Rolando.Caloca
DR - vk - Remove more defines into platform settings
Change 3819318 by Rolando.Caloca
DR - vk - Fixes for linking
Change 3819868 by Rolando.Caloca
DR - vk - Linux & Android fixes
Change 3819873 by Guillaume.Abadie
Adds support for PermutationId on r.DumpShaderDebugInfo=1
Change 3819940 by Rolando.Caloca
DR - vk - Fix Linux issues
Change 3819956 by Rolando.Caloca
DR - vk - Invalid check
Change 3819961 by Michael.Lentine
Hide attributes when plugin is not present
Change 3819980 by Rolando.Caloca
DR - vk - Standard validation always
Change 3820039 by Rolando.Caloca
DR - vk - Fix invalid ensure
Change 3820326 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3820422 by Michael.Lentine
Add back GBufferAO.
Change 3820433 by Rolando.Caloca
DR - Fix D3D12 crash on 20 thread (10x2 cores) machines
Change 3821677 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3821961 by Rolando.Caloca
DR - Vulkan uses real UB by default on non-Android
Change 3821968 by Rolando.Caloca
DR - vk - Update glslang 1.0.65.1
Change 3821969 by Uriel.Doyon
Added support for stat groups that must be sorted by name. Defined by DECLARE_STATS_GROUP_SORTBYNAME.
Change 3821983 by Rolando.Caloca
DR - vk - Change to static array (0.1ms on 10k draw calls)
Change 3824141 by Rolando.Caloca
DR - vk - Fix static analysis
- Bumped up some (c) 2017->2018
Change 3824355 by Rolando.Caloca
DR - vk - Accessor to find out if a cmd buffer has been submitted
Change 3824420 by Rolando.Caloca
DR - Sanity check number of queries per batch on D3D11 as to not break other RHIs
Change 3824463 by Rolando.Caloca
DR - Removed dummy ensure for D3D12
Change 3824609 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3826074 by Mark.Satterthwaite
Start IMP-caching the various descriptor types in mtlpp.
Change 3826098 by Rolando.Caloca
DR - vk - Dump layer compile fixes
Change 3826113 by Rolando.Caloca
DR - vk - Missing dump functions
Change 3826302 by Rolando.Caloca
DR - vk - Compile fix
- Change dump handles to %p
Change 3826635 by Mark.Satterthwaite
Forward declarations required for mtlpp compilation without exposing Metal headers - plus fixes to the mtlpp test compiler.
Change 3827072 by Mark.Satterthwaite
Switch some more mtlpp descriptors over to IMPTables from objc_msgSend.
Change 3827909 by Guillaume.Abadie
Replaces diaphragm DOF's prefiltering with LDS bank coherent bilateral reduction, and implements 1/8 res background gathering pass.
Change 3827952 by Guillaume.Abadie
Updates copyright to year 2018 on diaphragm DOF's new files.
Change 3828055 by Rolando.Caloca
DR - vk - Rename in prep for changes
Change 3828229 by Guillaume.Abadie
Avoids logging the name of a global shader type multiple times when it has multiple permutations while verifying the global shader map.
Change 3828427 by Guillaume.Abadie
Reimplements Max3x3 gathering post filtering for Diaphragm DOF with proper shader permutation.
Change 3829979 by Guillaume.Abadie
Fixes a color NaN source in diaphragm DOF's TAA pass.
Change 3830116 by Rolando.Caloca
DR - vk - Fix GPU queries/frame time on old system
- New system in place, disabled temporarily
Change 3830169 by Rolando.Caloca
DR - vk - Fix async pso creation crash
Change 3830193 by Rolando.Caloca
DR - vk - CPU RHI thread improvement
Change 3830291 by Guillaume.Abadie
Automatically lowers the number of gathering rings on the background half res gather pass as the CoC gets smaller.
Change 3830300 by Rolando.Caloca
DR - vk - Static analysis fix: Split VulkanCommon.h out of VulkanConfiguration.h
Change 3830589 by Mark.Satterthwaite
In mtlpp cache the IMPTables for all the Metal @protocol's that are dependent on the MTLDevice, this avoids a mutex & map lookup. Also make all the concrete types store their IMPTable statically as it won't change.
Change 3830793 by Mark.Satterthwaite
Fix a small number of bugs introduced with the mtlpp descriptor and table caching.
Change 3831491 by Jian.Ru
Fix driver version unknown
#jira UE-53688
Change 3832335 by Rolando.Caloca
DR - vk - Change include
Change 3832550 by Rolando.Caloca
DR - vk - Occlusion query rewrite WIP
Change 3832589 by Rolando.Caloca
DR - vk - Minor refactor to pools in prep for timestamps
Change 3832618 by Rolando.Caloca
DR - vk - Do not block timestamp queries
Change 3832636 by Rolando.Caloca
DR - vk - Fix old timestamp queries
Change 3833138 by Rolando.Caloca
DR - vk - Fix timestamp queries
Change 3833249 by Rolando.Caloca
DR - vk - Test lock
Change 3833667 by Rolando.Caloca
DR - vk - Old queries wait on the RHI thread now instead of the driver (disabled)
Change 3833907 by Daniel.Wright
Fixed NextStartOffset UAV index out of bounds
Change 3833918 by Daniel.Wright
D3D12 RHI: only refcount uniform buffers if GRHINeedsExtraDeletionLatency is false, which is no longer the case for PC or Xbox. The refcounting was heavy on performance as reported by a licensee because FRHIResource uses atomics for refcounting, which is only necessary when GRHINeedsExtraDeletionLatency is disabled.
Change 3834852 by Rolando.Caloca
DR - vk - Missing file
Change 3834858 by Guillaume.Abadie
Implements r.DOF.MinimalFullresBlurringRadius
Change 3834979 by Rolando.Caloca
DR - vk - Fix
Change 3836117 by Rolando.Caloca
DR - vk - Update to 1.0.65.1
Change 3836122 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitOcclusionBatchCmdBuffer
- Added new error codes/messages
Change 3836421 by Mark.Satterthwaite
For the purposes of debugging and conformance testing mtlpp make it possible to compile *without* the IMP cache so that we call the underlying Objective-C.
Change 3836896 by Uriel.Doyon
Fixed concurrency and exit issues around d3d12 pipeline states on windows.
Change 3837385 by Rolando.Caloca
DR - vk - Dump memory on OOM
Change 3837427 by Rolando.Caloca
DR - vk - Change some arrays to array views
Change 3837800 by Guillaume.Abadie
Implements SHADER_PERMUTATION_RANGE_INT to make contiguous integer permutations that do not start at 0.
Change 3838128 by Rolando.Caloca
DR - vk - Support for non-cached memory types
Change 3838540 by Guillaume.Abadie
Refactors Diaphragm DOF's CoC tile buffer under a single API for better maintainability.
Change 3838731 by Rolando.Caloca
DR - vk - Descriptor pools per command buffer pool (turned off)
Change 3838961 by Rolando.Caloca
DR - vk - Use ring buffer for per frame uniform buffers
- Enable descriptor pools per layout recycled per command buffer
Change 3839087 by Rolando.Caloca
DR - vk - Compile fixes for Android
Change 3839106 by Marcus.Wassmer
PR #4413: Removing unnecessary call to FString::ToLower (Contributed by gsfreema)
Change 3839252 by Mark.Satterthwaite
Fix mtlpp::Resource move operators.
Change 3839426 by Marcus.Wassmer
Duplicate 380972
Make PC GPU Benchmarks more reliable
Change 3840041 by Guillaume.Abadie
Fixes shader compilation failure in TAA with alpha channel through post processing support.
Change 3840257 by Chris.Bunner
Swapping a mul() to * in HLSLTranslator::Dot to allow scalar transformations per a UDN ticket.
Change 3840308 by Rolando.Caloca
DR - vk - Support for UB & non-UB on emulation mode
Change 3840586 by Rolando.Caloca
DR - Copy 3840577
Fix for CPUs with more than 16 cores
Change 3840671 by Rolando.Caloca
DR - vk - Copy from 3840663
Fix for layout ensure on HMD projects on Vulkan
Change 3840980 by Rolando.Caloca
DR - vk - Android compile fixes
Change 3841989 by Guillaume.Abadie
Slices Diaphragm DOF's Gather pass into multiple shader files, and CFLAG_StandardOptimization flag for faster iteration time.
Change 3842216 by Guillaume.Abadie
Fixes DDOF's foreground alpha channel.
Change 3842217 by Guillaume.Abadie
Implements r.DOF.MaximalForegroundBlurringRadius
Change 3842353 by Guillaume.Abadie
Allows disabling foreground gathering with r.DOF.MaximalForegroundBlurringRadius=0
Change 3842747 by Rolando.Caloca
DR - vk - Missing use of GPoolSizeVRAMPercentage
- Support for smaller allocations if page size is not available
Change 3842791 by Rolando.Caloca
DR - vk - Use 95% of available GPU memory to handle some fragmentation
Change 3843690 by Guillaume.Abadie
Fixes diaphragm DOF's foreground after all this refactoring.
Change 3844439 by Guillaume.Abadie
Improves CoC dilate pass to make the gather pass as fast as possible, but still without artifacts caused by the fast gathering optimisation.
Change 3844946 by Mark.Satterthwaite
rd_route v1.1.1 with attached TPS approval.
For macOS function interposition which is useful for debugging and the occasional workaround.
Change 3845164 by Mark.Satterthwaite
Add LLM support for macOS, including tracking of memory allocated in Objective-C. This makes use of runtime method swizzling in the Objective-C runtime and the rd_route library I added for Richard Wallis, which allows for arbitrary runtime function interposition and allows me to hook the custom allocators used in Apple's many Objective-C frameworks on which the whole macOS edifice is built. Objective-C objects are charged to the calling scope as they are too common to impose their own without murdering frame rate.
We would need a TPS approval for an iOS function interposition library for this to work fully on iOS. If desired in the short term, discarding LowLevelFree events that aren't in the map rather than asserting will work around the problem.
Change 3845849 by Marcus.Wassmer
Fix clang and some normal refactor errors
Change 3846026 by Rolando.Caloca
DR - vk - Descriptor set allocation scheme rewrite
- Type hash for each pool
- Desc sets Pool on device
Change 3846169 by Rolando.Caloca
DR - vk - Remove old code for non-layout descriptor set pools
Change 3846205 by Mark.Satterthwaite
Disambiguate the PatchControlPointOut struct definitions in Metal tessellation shaders at Apple's suggestion to avoid a metallib gotcha.
Change 3846346 by Arne.Schober
DR - Missing Vector instructions
Change 3847037 by Arne.Schober
DR - Fix issue with GPU skincache where the offset of the clothbuffer is not relative to the offset of the actual vertexbuffer.
Fixed MorphTarget Skincache Offset mix-up
Change 3847275 by Marcus.Wassmer
Copying MGPU to Dev-Rendering (//UE4/Dev-Rendering)
Change 3847464 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3847707 by Michael.Lentine
Only use MorphTargetOffset when the shader enables morph targets.
Change 3848533 by Richard.Wallis
Handle Metal adding FirstInstance into [[ instance_id ]] which is different to other APIs. SV_InstanceID and SV_VertexID should now have their respective base instance and base vertex ID's subtracted before use in the shader.
#jira UE-51716
Change 3848625 by Richard.Wallis
Compile Fix
Change 3848725 by Rolando.Caloca
DR - Remove use of Build/SetLocalGraphicsPipelineState
Change 3848797 by Rolando.Caloca
DR - Deprecate Build/SetLocalGraphicsPipelineState
Change 3849237 by Arne.Schober
DR - AddCustom Ver for ModelVertex Serialization
Change 3851247 by Rolando.Caloca
DR - vk - Util functions
Change 3851523 by Arne.Schober
DR - Update Reflection Comparison shot from the BuildFarm.
Change 3851859 by Rolando.Caloca
DR - vk - Skip loader
Change 3851889 by Krzysztof.Narkowicz
Removed lights with lighting channels out of tiled deferred light list. Tiled deferred lights do not support lighting channels and it wasn't worth adding extra complexity to this shader in order to support this special case.
#jira UE-51512
Change 3852181 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3852547 by Uriel.Doyon
Fixed Pre-Exposure shader compilation and Temporal AA issue.
#jira UE-54276
Change 3852637 by Arne.Schober
DR - Fixing Normal Automated Test Result
Change 3853167 by Richard.Wallis
AvfPlayer - support for streaming media. Due to an operator new/delete mismatch in Apple's CFNetwork - we've had to change out one of that framework's allocators using rd_route to avoid the memory corruption.
#jira UE-35637
Change 3853447 by Chris.Bunner
Fixing typos.
Change 3853645 by Krzysztof.Narkowicz
Fixed light functions on subsurface materials
Removed strange code from blending between static and dynamic shadows
#jira UE-50275
Change 3853660 by Rolando.Caloca
DR - Fix OpenGL overwriting texture samplers on forward renderer
Change 3853945 by Mark.Satterthwaite
Duplicate #3831616
Fix the black ground scattering on Metal - we've had issues with the atmospheric fog calculations for a long time - one or more intermediate operations generates different precision on Metal so we end up passing -ve values into sqrt which then generates NaN/INF. For Metal when compiling this file and this file only #define sqrt() to sqrt(abs()) so that we don't see any more unexpected black in atmospheric rendering. This is far from ideal but I don't want to apply abs to all inputs of every sqrt because AFAIK this is the only case where we have an issue, and until we have a way to investigate each intermediate calculation that isn't ridiculously, soul-crushingly tedious, it isn't practical to identify the source of the error.
#jira UE-53720
Change 3853966 by Mark.Satterthwaite
Duplicate #3835852
Fix tessellation shaders in Metal with Manual Vertex Fetch enabled:
- The control points index buffer shouldn't collide with anything else.
- We can't use the optimisation of loading texture width & height from the buffer meta-table in tessellation shaders as the combined stages don't guarantee not to clobber unused buffer slots and screw it up when we use linear textures.
#jira UE-53851
Change 3854250 by Uriel.Doyon
Fix fbx automation tests
Change 3854736 by Uriel.Doyon
Added a tooltip to the EV100 slider in the exposure menu.
Using game settings now disables the slider.
#jira UE-53945
Change 3855047 by Jian.Ru
Fix DFAO getting NANs when samples out of ViewRect
#jira UE-54403
Change 3858197 by Krzysztof.Narkowicz
View frustum shadow caster culling for pointlights/spotlights
#jira UE-54381
Change 3860081 by Krzysztof.Narkowicz
Tighter bounding sphere for a spotlight
Replaced IntersectSphere(LightProxy->Origin, LightProxy->Radius) with LightProxy->SphereBounds for tighter culling of spotlights
Directional light GetBoundingSphere() now everywhere returns Sphere((0,0,0),HALF_WORLD_MAX) for consistency and proper SphereBounds
#jira UE-54258
Change 3860324 by Mark.Satterthwaite
Update the macOS deployment target version to 10.12 from 10.11 as we officially ended support for El Capitan a while ago. Should mean that libraries compiled for 10.12 and up won't cause link warnings.
Change 3860945 by Arne.Schober
DR - Fix not releasing SRV on render thread for FPositionVertexBuffer, FStaticMeshVertexBuffer, FColorVertexBuffer, FStaticMeshInstanceBuffer.
#jira UE-54587
Change 3861129 by Jian.Ru
Prevent distance culled objects from casting distance field direct shadows
#jira UE-54533
Change 3861502 by Jian.Ru
Exclude distance culled objects from DFAO calculation
#jira UE-54533
Change 3862243 by Krzysztof.Narkowicz
Changed radius of a directional light's bounding sphere from HALF_WORLD_MAX to WORLD_MAX in order to encompass the entire WORLD_MAX box
Change 3863476 by Krzysztof.Narkowicz
Added BuildReflections option to ResavePackages commandlet
#jira UE-54581
Change 3863717 by Rolando.Caloca
DR - vk - Missed using pipeline cache on compute PSOs
Change 3865332 by Arne.Schober
DR - Fix UE-52356 Bone Weight
Change 3866220 by Rolando.Caloca
DR - vk - Fixed GetNativeResource missing on textures
- Added support for -preferNvidia|AMD|Intel
- Added VulkanRHIBridge.h
- Minor fixes
Change 3866222 by Rolando.Caloca
DR - vk - Missed file
Change 3866951 by Krzysztof.Narkowicz
Fixed FreezeRendering on non editor builds: ComputeAndMarkRelevanceForViewParallel was calling FrozenMatricesGuard on multiple threads, reading and writing view matrices state in parallel.
#jira UE-53640
Change 3867231 by Guillaume.Abadie
Adds alpha mode to allow the tonemapper to passthrough the alpha channel for broadcast industry.
Change 3867233 by Guillaume.Abadie
Fixes a compilation failures in TAAU with r.PostProcessing.PropagateAlpha==2
Change 3867594 by Daniel.Wright
Removed EditorOnlyDefaultMaterials, which added 79s of shader compilation during startup
Added a dialog when opening the Material Editor on a Default Material, warning of advanced workflow
Preventing Material Editor Apply or Save for a Default Material when the preview material has compilation errors
Change 3870048 by Daniel.Wright
Cleaned up formatting in TranslucentRendering from merges
Change 3870106 by Krzysztof.Narkowicz
Fixed some FArchive Tell()/Seek() 64bit->32bit truncations
Change 3870211 by Rolando.Caloca
DR - vk - Added -vulkanvalidation=N/-vulkanstandardvalidation/-novulkanstandardvalidation to set validation layer behaviour from cmd line
Change 3870225 by Rolando.Caloca
DR - vk - Some platforms do not use a standard swapchain
Change 3870267 by Arne.Schober
DR - SafeRelease SRVs that might be held by the Vertexfactories (maybe due to indirect use in GlobalResources)
Note that the VFs are not owners of the data, e.g. the underlying Buffers might be released before this and this reference counting should be unnecessary
Change 3870647 by Daniel.Wright
Moved FogRendering.h to Renderer
Change 3872130 by Krzysztof.Narkowicz
Disable USE_GLOBAL_CLIP_PLANE for MATERIAL_DOMAIN_POSTPROCESS and MATERIAL_DOMAIN_UI
Merging GitHub Pull request #4459
"When material domain is not needing global clip plane there is no need to generate any code involving it. This does not alter output but removes lot of code at vertex shader and pixel shaders. At least on mobile rendered was actually generating clipping code for ui materials."
#jira UE-54616
Change 3872145 by Rolando.Caloca
DR - vk - Optional SupportsMarkersWithoutExtension
Change 3872404 by Uriel.Doyon
Added some guards when streaming virtual textures.
Fixed optimized UCanvasRenderTarget2D::RepaintCanvas() to prevent resolving the texture twice.
Fixed bad mipmap generation with UCanvasRenderTarget2D.
Change 3872507 by Arne.Schober
Back out changelist 3870267
Change 3874176 by Ben.Marsh
IncludeTool: Add a flag to prevent scanning source files for exported symbols.
Change 3874935 by Krzysztof.Narkowicz
Fixed white thumbnails and other issues with sky lighting on ES3_1 path, by disabling GGX prefiltering, as mobile path doesn't have a single cubemap with all initialized mips. Instead it ping-pongs between 2 partially initialized.
#jira UE-54656
Change 3875710 by Daniel.Wright
Renamed uniform buffer member macros to be much shorter for readability
Change 3876665 by Guillaume.Abadie
Cherry-pick 3870715: Implements DOF's hybrid scattering bare bones.
Change 3876666 by Guillaume.Abadie
Cherry-pick 3871786: DOF hybrid scattering: fixes NaN source, transition to gather on close to screen edge and low intensity.
Change 3876677 by Guillaume.Abadie
Cherry-pick 3872348: Implements neighbor comparison for DOF's scattering compilation pass.
Change 3876680 by Guillaume.Abadie
Cherry-pick 3872357: Oups... fixes build...
Change 3876683 by Guillaume.Abadie
Cherry-pick 3872475: Controls number of mip to generate with DOF's reduce pass.
Change 3876687 by Guillaume.Abadie
Cherry-pick 3874104: Fixes various bugs in diaphragm DOF's hybrid scattering.
Change 3876690 by Guillaume.Abadie
Cherry-pick 3874144: Packs multiple DOF scattering group into same draw instance.
Change 3876694 by Guillaume.Abadie
Cherry-pick 3874275: Switches hybrid scattering with indexed indirect draw call to reduce scatter vertex shader invocation.
Change 3876695 by Guillaume.Abadie
Cherry-pick 3874674: Records min and max coc on DOF's setup's draw event.
Change 3876783 by Rolando.Caloca
DR - Static analysis fix
Change 3876845 by Guillaume.Abadie
Implements USceneCaptureComponent::ProfilingEventName
Change 3877197 by Rolando.Caloca
DR - vk - OQ fixes (disabled)
Change 3877428 by Krzysztof.Narkowicz
Merged with tiny tweaks Ansel photography plugin improvements from Adam Moss (GitHub pull request #4426):
-The free-roaming photography camera has new constraints by default, i.e. it can't pass through walls
-Photography session can be started and stopped programmatically, e.g. making it possible to bind photography to an alternative hotkey or button combo. This was an often-requested feature.
-Tweakables and utilities are now exposed through a Blueprint Function Library (rather than direct manipulation of console variables)
-The Ansel photography session UI now exposes some engine effect tweakables as sliders. For example, if the game is using depth-of-field then sliders are made available to allow the photographer to change the focal depth etc. The developer may suppress this behavior through the Blueprint Function Library.
-Letterboxing is now removed during multi-part capture, d'oh.
-Tiled shots are taken at full resolution even if ScreenPercentage < 100
-SSR is enabled during super-resolution shots since Ansel is now better at hiding any ensuing artifacts
-Postprocess settings are frozen at session start to avoid discontinuities during photography, i.e. wandering between postprocess volumes when the camera auto-moves for stereo and 360 shots.
#jira UE-54244
#4426
Change 3879086 by Krzysztof.Narkowicz
Fixed sky/reflection capture (without owner) update - they are now updated only with a corresponding world
Change 3879090 by Guillaume.Abadie
Fixes tons of regressions on diaphragm DOF's recombine passes.
Change 3879198 by Rolando.Caloca
DR - vk - Support for real uniform buffers on Android platforms
Change 3879993 by Krzysztof.Narkowicz
-Fixed int64->int32 FArchive offset truncation in TShaderMap, VertexFactory and TextureDerivedData
-Fixed FSerializationHistory bug, when trying to serialize 0 bytes
#jira UE-43203
Change 3881462 by Guillaume.Abadie
Implements full res DOF's setup pass for cheaper full res gathering in recombine pass.
Change 3881524 by Krzysztof.Narkowicz
Fixed compilation by removing FTickableEditorObject from FPreviewScene
Change 3881724 by Chris.Bunner
Static analysis fix.
#jira UE-54762
Change 3881861 by Rolando.Caloca
DR - vk - Fix layout warning when generating mip chain
Change 3881864 by Rolando.Caloca
DR - Use render passes on HZB
Change 3882236 by Yuriy.ODonnell
IndirectLightingColorScale is now applied to SubsurfaceLighting and DiffuseLighting. Was previously only applied to DiffuseLighting.
#jira UE-42534
#github 3326
Change 3882325 by Guillaume.Abadie
Implements FocusOnly lower gathering pass for Diaphragm DOF's slight out of focus temporal stability.
Change 3882340 by Rolando.Caloca
DR - vk - Fix api dump
Change 3882430 by Rolando.Caloca
DR - vk - KHR_maintenance2
Change 3882563 by Rolando.Caloca
DR - Add depth-stencil access mode to PSO initializer
Change 3882929 by Rolando.Caloca
DR - vk - Proper fix for maintenance extension macros
Change 3883087 by Mark.Satterthwaite
Allow disabling VSync in windowed mode for macOS 10.13.4 and above.
Change 3883597 by Guillaume.Abadie
Collapses full and half res DOF setup passes together.
Change 3883702 by Guillaume.Abadie
Fixes mac's build.
Change 3884747 by Uriel.Doyon
Fix for static analysis warning
Change 3884975 by Rolando.Caloca
DR - vk - Move some platform defines to platform properties
Change 3884988 by Rolando.Caloca
DR - vk - Make an override per platform
Change 3885832 by Rolando.Caloca
DR - vk - Cosmetic change to group similar members
Change 3885891 by Rolando.Caloca
DR - vk - Some _RenderThread functions to avoid stalls
Change 3886044 by Rolando.Caloca
DR - Added RHI api _RenderThread version of
RHICreateTextureReference
RHICreateShaderLibrary
RHICreateRenderQuery
Change 3886560 by Guillaume.Abadie
Fixes strong aliasing on TAAU's fast shader permutation.
This adds a 6th neighbor sample and switches AA_TONE on, as TAA does for its fast shader permutation.
Change 3886749 by Guillaume.Abadie
Cherry-pick 3884748: Implements DOF's BuildBokehLUT for diaphragm blades simulation.
Only used in hybrid scattering for now.
Change 3886750 by Guillaume.Abadie
Cherry-pick 3885457: Simulates diaphragm blades' curvature on bokeh.
Change 3886752 by Rolando.Caloca
DR - Fix metal static analysis
Change 3887460 by Uriel.Doyon
Fix for more static analysis warnings.
Change 3888201 by Rolando.Caloca
DR - vk - Added r.Vulkan.SubmitAfterEveryEndRenderPass
- Fixed bad layout on rendering back buffer
Change 3888209 by Rolando.Caloca
DR - vk - Unity compile fix
Change 3888254 by Rolando.Caloca
DR - vk - Fix async texture layout
Change 3888893 by Guillaume.Abadie
Simulates bokeh in DOF's slight out of focus.
Change 3889085 by Guillaume.Abadie
Fixes DOF's reduce pass sampling outside viewport.
Change 3889924 by Rolando.Caloca
DR - vk - Skip seemingly bad validation error
Change 3890573 by Daniel.Wright
Only initialize FDiaphragmDOFGlobalResource in Feature Level 5
Change 3890590 by Arne.Schober
DR - Fix Paper2d crash. When addMesh is called the vertex and index buffers are nulled out. Re-create the Dynamic Mesh builder for every mesh instead.
#jira UE-55063
Change 3890638 by Arne.Schober
DR - Better fix for Paper2d which honors batching
#jira UE-55063
Change 3891099 by Krzysztof.Narkowicz
1.5 texel shadow offset fix inside Manual2x2PCF based on #4485 GitHub pull request
#jira UE-54985
#4485
Change 3891234 by Krzysztof.Narkowicz
Optimized PCF2x2 and PCF3x3 - merged #4494 GitHub pull request
#jira UE-55121
Change 3891407 by Rolando.Caloca
DR - vk - Set vendor id earlier
Change 3891417 by Rolando.Caloca
DR - vk - Missing layout transitions
Change 3891718 by Arne.Schober
DR - Do not recreate one Frame Resource for dynamic draws
#jira UE-55063
Change 3891925 by Yuriy.ODonnell
Fix/workaround for inconsistent preprocessor definitions for NVAftermath that result in FD3D11DynamicRHI class layout mismatch. NVAftermath support is now enabled by default for Win64.
NVAftermath is declared as a private dependency in D3D11RHI. It does not automatically propagate to modules that explicitly include private RHI headers (OculusHMD, OSVR, OSVRInput). This results in NV_AFTERMATH being defined while compiling RHI module and not defined when compiling other modules, causing memory corruption at runtime.
The long-term solution for this and similar issues requires some mechanism for adding transitive module dependencies, so that anyone that depends on D3D11RHI module would automatically also get the NVAftermath. Additionally, private headers should *never* be included directly by external modules.
The short-term solution is to explicitly add NVAftermath dependency to OculusHMD, OSVR and OSVRInput.
Additionally, NV_AFTERMATH is no longer forced by D3D11RHIPrivate.h when it's not defined. This allows catching this kind of mismatch in the future through a compiler warning (C4668).
#jira UE-53065
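The layout mismatch described in this change can be illustrated with a minimal sketch. The two structs below are hypothetical stand-ins for one class compiled in two modules, once with `NV_AFTERMATH` defined and once without; they are not the actual `FD3D11DynamicRHI` declaration, and the member names are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in: the layout a module sees when the define is set.
struct FDynamicRHI_WithAftermath
{
    void* Device;
    void* AftermathContext; // member that only exists when the define is set
    int   FrameCounter;
};

// Hypothetical stand-in: the layout a module sees when the define is missing.
struct FDynamicRHI_WithoutAftermath
{
    void* Device;
    int   FrameCounter;
};

// Offset of a member that follows the conditional member, per module view.
inline std::size_t FrameCounterOffsetWith()
{
    return offsetof(FDynamicRHI_WithAftermath, FrameCounter);
}

inline std::size_t FrameCounterOffsetWithout()
{
    return offsetof(FDynamicRHI_WithoutAftermath, FrameCounter);
}

// True when the two modules disagree about where FrameCounter lives, which
// means one of them would read or write the wrong bytes of a shared object.
inline bool LayoutsDisagree()
{
    return FrameCounterOffsetWith() != FrameCounterOffsetWithout();
}
```

A module built against the second layout that touches `FrameCounter` on an object created by the first would access the bytes of `AftermathContext` instead, which is the kind of memory corruption the change describes.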
Change 3891987 by Rolando.Caloca
DR - vk - Support for dedicated allocations
Change 3892339 by Jian.Ru
Fix a crash when tessellation shaders are used in dx12
#jira UE-55127
Change 3892528 by Rolando.Caloca
DR - vk - Update Linux headers
Change 3892867 by Rolando.Caloca
DR - vk - Don't create swapchain if not needed
Change 3893416 by Guillaume.Abadie
Implements bokeh simulation on foreground and background gather.
Change 3893732 by Chris.Bunner
GetRelevance_Internal should use the immediate parent resource, not the base, as some features are overridden by permutations e.g. UsesWorldPositionOffset.
#jira UE-53404
Change 3893868 by Guillaume.Abadie
Allocates diaphragm DOF's buffers and structured buffer only on supported platforms.
Change 3893917 by Chris.Bunner
Potential fix for CIS.
Change 3893933 by Chris.Bunner
Duplicating CL 2647737 as this is the same issue from that JIRA where accessing game-thread data was being prevented. We don't have this check in UMaterial::GetMaterialResource already, but presumably the UMaterialInstance case was never removed as we've not been calling it until now.
Change 3894218 by Rolando.Caloca
DR - vk - Remove stat counters per draw call, gains 10% CPU on Infiltrator
Change 3894579 by Arne.Schober
RT - Fix assert not in RenderingThread from Triangle Renderer.
#jira UE-55247
Change 3894724 by Rolando.Caloca
DR - vk - New API for batching barriers
Change 3894909 by Arne.Schober
DR - Fix crash in Speedtree wind where Renderdata is unavailable
#jira UE-54544
Change 3895414 by Rolando.Caloca
DR - Add a configurable threshold for SCW timeouts
Change 3896429 by Marcus.Wassmer
Allow variable frame-latency delay in FrameGrabber frames. For performance you want at least a 1 frame delay so you don't sync the GPU to the CPU.
Change 3896495 by Marcus.Wassmer
Set pointer properly
Fix CIS
Change 3897253 by Guillaume.Abadie
Fixes CIS warning in diaphragm DOF
Change 3899179 by Guillaume.Abadie
Implements background hybrid scatter occlusion for diaphragm DOF.
Change 3903654 by Rolando.Caloca
DR - vk - Rework dump layer to allow other layers
Change 3903766 by Rolando.Caloca
DR - vk - More wrappers
Change 3904025 by Rolando.Caloca
DR - vk - More wrappers
Change 3904342 by Rolando.Caloca
DR - vk - Track image resources & callstacks
Change 3904346 by Rolando.Caloca
DR - vk - Copy fix from 4.19 for flickering grass
Change 3904510 by Rolando.Caloca
DR - vk - Compile fix
Change 3904914 by Daniel.Wright
[Integrate] Fixed PS4 transitions with forward shading
Change 3904916 by Daniel.Wright
[Integrate] Fixed PS4 transitions with occlusion queries
Change 3905975 by Rolando.Caloca
DR - vk - Missing wrappers
Change 3905977 by Rolando.Caloca
DR - vk - Missed file
Change 3907829 by Rolando.Caloca
DR - Move depth bounds to the PSO
Change 3907832 by Rolando.Caloca
DR - vk - Prep for delaying transitions
Change 3907834 by Rolando.Caloca
DR - vk - Fix for depth stencil issues/validation errors
Change 3907967 by Rolando.Caloca
DR - vk - Linux compile
Change 3908093 by Rolando.Caloca
DR - vk - Fix depthstencil layout on descriptors
Change 3908393 by Rolando.Caloca
DR - vk - Disable dedicated allocation as it causes crashes on Nvidia 700 series
Change 3908401 by Rolando.Caloca
DR - Do transitions outside render pass
Change 3908422 by Rolando.Caloca
DR - vk - Fix transition state not getting stored
Change 3908735 by Guillaume.Abadie
Cherry-pick 3896619: Fixes after TAAU post process material that had wrong default buffer UV.
#jira UE-55317
Change 3908736 by Guillaume.Abadie
Cherry-pick 3891352: Fixes ensure when visualizing HDR with TAAU.
#jira UE-55019
Change 3908753 by Guillaume.Abadie
Lets the renderer layout the views in the internal render targets like it prefers.
Change 3909119 by Daniel.Wright
Fix some static analysis warnings
Change 3911943 by Rolando.Caloca
DR - vk - Fix for packaging Vulkan projects
Change 3912145 by Rolando.Caloca
DR - vk - Fix layout on streaming textures
Change 3913029 by Rolando.Caloca
DR - Fix missing transition
Change 3913048 by Rolando.Caloca
DR - Fix for hlslcc
Change 3913054 by Rolando.Caloca
DR - vk - Fix number of layers on barrier
Change 3913171 by Rolando.Caloca
DR - vk - Fix for decal missing transition
Change 3913211 by Rolando.Caloca
DR - vk - Add debug name to image tracking
Change 3913449 by Rolando.Caloca
DR - vk - Restore transition
Change 3913466 by Rolando.Caloca
DR - Fix Vulkan EngineTest
Change 3913537 by Rolando.Caloca
DR - vk - Fixes independent samplers & textures (contributed by AMD)
Change 3913548 by Rolando.Caloca
DR - vk - Warning fix
Change 3913691 by Rolando.Caloca
DR - vk - Fixes for parallel (wip)
Change 3914656 by Rolando.Caloca
DR - vk - Fix bug when using separate samplerstates and textures
Change 3914730 by Rolando.Caloca
DR - vk - Bump version
Change 3914764 by Rolando.Caloca
DR - vk - Don't crash on exit
Change 3915532 by Rolando.Caloca
DR - vk - Parallel context fixes
Change 3915589 by Rolando.Caloca
DR - vk - Hoist and rename transition and layout manager class out of the context
Change 3915592 by Rolando.Caloca
DR - Fix gpu marker name
Change 3917607 by Rolando.Caloca
DR - vk - Fix depth bounds on Vulkan
Change 3917609 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3917616 by Rolando.Caloca
DR - Fix D3D11 initialization
Change 3920569 by Rolando.Caloca
DR - vk - Prep for layout mgr refactor
Change 3921023 by Rolando.Caloca
DR - vk - Dump layer fixes
Change 3921623 by Rolando.Caloca
DR - vk - Prep refactor for layouts
- Dump now shows marker tree
Change 3922007 by Rolando.Caloca
DR - vk - Fix extra allocation per draw call
Change 3922442 by Rolando.Caloca
DR - vk - Detect potential issues
Change 3922470 by Rolando.Caloca
DR - vk - Minor optimization
Change 3922482 by Rolando.Caloca
DR - vk - More minor optimizations
Change 3923158 by Rolando.Caloca
DR - Move r.DisableEngineAndAppRegistration out to common RHI and use it on Vulkan
Change 3923486 by Rolando.Caloca
DR - vk - Minor cpu optimizations
Change 3923505 by Rolando.Caloca
DR - vk - Use bigger allocations for uniform buffers
Change 3923516 by Rolando.Caloca
DR - vk - Android compile fix
Change 3923557 by Rolando.Caloca
DR - vk - Cache descriptorset layouts, refactor duplicated code
Change 3923851 by Rolando.Caloca
DR - vk - Linux compile fix
Change 3924153 by Rolando.Caloca
DR - vk - Support for dynamic UBs
Change 3924193 by Rolando.Caloca
DR - vk - Remove old per pso descriptor pools
Change 3924197 by Rolando.Caloca
DR - vk - Remove unused global uniform buffer pool
Change 3924220 by Rolando.Caloca
DR - vk - Wrap some unused classes in their define
Change 3924234 by Rolando.Caloca
DR - vk - Show ring buffer wrapping messages
Change 3924243 by Rolando.Caloca
DR - vk - Fix bad dynamic buffer
Change 3924902 by Rolando.Caloca
DR - vk - Fix crash running infiltrator
Change 3925209 by Rolando.Caloca
DR - vk - Fix bug with dynamic buffers
- Remove old defines
Change 3925300 by Rolando.Caloca
DR - vk - Allow packed uniforms as dynamic UBs (with r.Vulkan.DynamicGlobalUBs)
Change 3925627 by Rolando.Caloca
DR - vk - Move DynamicOffsets into the pipeline state
Change 3925834 by Rolando.Caloca
DR - vk - Cache per stage information
Change 3925835 by Daniel.Wright
Fixed DisplayName for UParticleModuleCollisionGPU
Change 3925897 by Rolando.Caloca
DR - vk - Split update descriptors loop
Change 3926488 by Rolando.Caloca
DR - vk - 16MB for ring buffer on desktop, 8 MB for mobile
Change 3928168 by Guillaume.Abadie
Cherry-pick 3917219: Implements r.DOF.RecombineQuality
Change 3928173 by Guillaume.Abadie
Cherry-pick 3927888: Enables r.DOF.HybridScatter.BackgroundCompositing and r.DOF.HybridScatter.ForegroundCompositing to work when both enabled.
Change 3928216 by Rolando.Caloca
DR - vk - Fix Android
- Fix static analysis
Change 3929119 by Rolando.Caloca
DR - vk - Rename some classes for clarity
- Fix read-only cvar
Change 3929151 by Rolando.Caloca
DR - vk - Rename class
Change 3930046 by Rolando.Caloca
DR - Temp fix Vulkan flickering grass
Change 3930148 by Rolando.Caloca
DR - vk - Only update dirty descriptors
- Use dynamic descriptors for packed global uniform buffers
Change 3930998 by Guillaume.Abadie
Packs shader permutation in different XGE submissions.
Change 3931079 by Rolando.Caloca
DR - vk - Fixes for Android and non-real ubs platforms
Change 3931942 by Krzysztof.Narkowicz
Depth rendering - When EarlyZPassMode is set to DDM_AllOccluders, dynamic objects need also to test bUseAsOccluder just like static ones
#jira none
Change 3932819 by Daniel.Wright
[Integrate] Scene Textures uniform buffer
* Base Pass Uniform Buffer now contains a Scene Textures uniform buffer. Previously the translucent base pass had to check ~40 loose scene texture parameters every draw.
* FMeshMaterialShader's must now bind PassUniformBuffer and supply a valid pass uniform buffer. For most passes this is just FSceneTextureUniformParameters.
* FRendererModule::DrawTileMesh can now cleanly set dummy scene texture resources, just by configuring how the pass uniform buffer is created.
* Moved scene texture shader functions out of Common, into SceneTexturesCommon which must be manually included by shaders that want to use them
* Separate Mobile Scene Textures uniform buffer to silo the platform complexities
Moved DBuffer inputs out of FDeferredPixelShaderParameters and into FOpaqueBasePassUniformParameters
Removed per-frame material uniform expressions. GameTime material node with period is now implemented with an fmod in the shader, without the use of MaterialFloat, so that it will happen at full precision.
* Per-frame expressions were used when the GameTime material node had a period, to do the fmod on the CPU where 32 bit precision is guaranteed, for mobile GPU's where pixel shader precision is sometimes less than 32fp.
Moved forward shading data into the Base Pass Uniform Buffer
Removed instanced stereo support for the light cull grid - will have to be reimplemented without changing SRV's per draw
Base pass sets View Uniform Buffer from DrawRenderState instead of choosing which one to set per-draw
Fixed padding in nested uniform buffer structs
Skip SRV members on Feature Level SM4 and below
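The periodic GameTime change in this list can be sketched as follows. This is a C++ stand-in for the shader-side computation, assuming 32-bit floats play the role of full precision; the function name is hypothetical and does not appear in the engine.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: wraps a game time into [0, Period) using a 32-bit
// float fmod, the analogue of evaluating fmod in the shader at full
// precision instead of precomputing it on the CPU each frame.
inline float PeriodicGameTime(float GameTimeSeconds, float PeriodSeconds)
{
    return std::fmod(GameTimeSeconds, PeriodSeconds);
}
```

For example, `PeriodicGameTime(7.5f, 2.0f)` yields `1.5f`, since 7.5 lies 1.5 seconds past the last multiple of the period.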
Change 3932964 by Rolando.Caloca
DR - vk - Renderdoc on Android
Change 3933095 by Daniel.Wright
Moved FSceneTextureUniformParameters out of the opaque base pass uniform buffer.
* Base Pass shaders now enable SCENE_TEXTURES_DISABLED when compiling for a material of any domain other than MD_Surface. These are used when rendering thumbnails of a material in a different domain, which could be opaque, but the opaque base pass drawing policy does not bind a scene textures uniform buffer, so the shader must not bind it.
* Opaque materials can no longer use EyeAdaptation.
Change 3933096 by Daniel.Wright
Better d3d11 assert message when a uniform buffer was not set by the renderer
Change 3933176 by Rolando.Caloca
DR - vk - Prefer mailbox if available
Change 3933271 by Ryan.Vance
#jira UE-55936
Fixed missing referenced uniform bindings on AR pass-through camera shaders.
Change 3934000 by Guillaume.Abadie
Fixes Win32 build in ShaderCompilerXGE.cpp
Change 3934299 by Guillaume.Abadie
Fixes a bug in DOF's reduce operator that was causing color leaking between background and foreground.
Change 3934699 by Daniel.Wright
Added bAffectDistanceFieldLighting to landscape
Change 3935190 by Daniel.Wright
Forward Light Grid SRV's use StructuredBuffer on Metal, instead of 'invariant Buffer', which throws off RemoveUniformBuffersFromSource parsing
Change 3935606 by Daniel.Wright
Removed LightmapPolicy::Set which was needed for vertex lightmaps
Renamed FVertexFactory::Set to SetStreams to make it findable
Change 3936510 by Rolando.Caloca
DR - vk - Update glslangValidator.exe to 1.0.65.1 for dumped debug SPIRV shaders
Change 3936545 by Richard.Wallis
Clone of CL's (3925763, 3925430, 3925424, 3925385, 3925278) Mark Satt's Xcode fixes from task stream //Tasks/UE4/Dev-UERNDR-354-mtlpp/
Plus XCode 9.2 compile fix in ApplicationPlatformCompilerPreSetup.h for -Wunused-lambda-capture.
Change 3938061 by Daniel.Wright
Vulkan: Added support for SRV's in Uniform Buffers
Change 3938123 by Daniel.Wright
Vulkan: Slightly better assert for null resources in uniform buffer
Change 3939197 by Rolando.Caloca
DR - vk - Disable custom memory mgmt
Change 3939677 by Rolando.Caloca
DR - vk - Fix static analysis warning
Change 3939809 by Rolando.Caloca
DR - vk - Fixes for async compute
Change 3939875 by Rolando.Caloca
DR - vk - Support for -vktrace
Change 3939977 by Rolando.Caloca
DR - vk - Skip a condition during gather UBs
- Set up efficient compute async var
- Fix validation cmd line
Change 3939982 by Rolando.Caloca
DR - vk - Revert mipchain
Change 3939984 by Rolando.Caloca
DR - vk - Remove unnecessary asserts
Change 3940082 by Rolando.Caloca
DR - vk - Custom mem mgr
Change 3940475 by Rolando.Caloca
DR - vk - Fix DFAO (indirect draw offset)
Change 3940555 by Rolando.Caloca
DR - vk - Minor fixes
Change 3940675 by Rolando.Caloca
DR - vk - Fix indirect type mismatch
Change 3941111 by Rolando.Caloca
DR - Renderpass bGeneratingMips
Change 3941847 by Daniel.Wright
Fixed Volumetric Lightmaps on Static geometry only working if the geometry had been built with Surface Lightmaps before
Change 3941978 by Rolando.Caloca
DR - vk - Minor fixes for presenting on compute queue
Change 3942074 by Rolando.Caloca
DR - vk - Remove some RHI stalls
- Fixed swap chain stat
Change 3943946 by Daniel.Wright
Fixed Texcoord0 on Volume materials on a particle sprite, including SubUV particles.
Change 3944065 by Daniel.Wright
Fixed SceneDepth collision getting broken on GPU particles when a scene capture is rendering
Change 3944158 by Daniel.Wright
Fixed ViewUniformShaderParameters accessing GEngine->PreIntegratedSkinBRDFTexture too early during slate loading screen
Change 3944865 by Rolando.Caloca
DR - vk - Prep for render passes
Change 3945196 by Rolando.Caloca
DR - Move render pass validate to cpp
Change 3945202 by Rolando.Caloca
DR - vk - Some fixes for using real render passes
Change 3945357 by Rolando.Caloca
DR - Fix bad condition
Change 3946295 by Yuriy.ODonnell
Added a sentinel member to FLightMap, which is initialized in the ctor and reset in the dtor. Sentinel is then checked in FLightCacheInterface::GetLightMapInteraction().
This aims to shed some more light on a hard-to-repro crash, which is suspected to be a use-after-free bug: http://crashreporter/Buggs/Show/1785593
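A minimal sketch of the sentinel pattern described above, assuming hypothetical names and magic values (the real `FLightMap` member and constants are not given in this description):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical magic values; any two distinct, recognizable constants work.
constexpr std::uint32_t GSentinelAlive = 0x600DF00Du;
constexpr std::uint32_t GSentinelDead  = 0xDEADDEADu;

// Hypothetical class showing the pattern: set in the ctor, reset in the dtor.
class FSentinelGuarded
{
public:
    FSentinelGuarded() : Sentinel(GSentinelAlive) {}
    ~FSentinelGuarded() { Sentinel = GSentinelDead; }

    std::uint32_t GetSentinel() const { return Sentinel; }

private:
    // volatile so the store in the destructor is not optimized away.
    volatile std::uint32_t Sentinel;
};

// The check a caller performs before trusting the object.
inline bool SentinelLooksAlive(std::uint32_t Value)
{
    return Value == GSentinelAlive;
}
```

A caller would evaluate `SentinelLooksAlive(Object.GetSentinel())` before use. Reading a destroyed object is still undefined behavior, so this is a debugging aid that tends to surface a use-after-free, not a guarantee.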
Change 3946407 by Rolando.Caloca
DR - vk - Prep for refactor
Change 3946648 by Rolando.Caloca
DR - vk - Fixes for async compute (wip)
Change 3947299 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3948434 by Rolando.Caloca
DR - vk - Fix exiting with parallel
Change 3948928 by Rolando.Caloca
DR - vk - Fix enabling draw markers for tools
Change 3949021 by Rolando.Caloca
DR - vk - Buffer tracking layer
Change 3949602 by Rolando.Caloca
DR - vk - static analysis fix
Change 3949757 by Rolando.Caloca
DR - vk - Remove bogus parameter
Change 3949810 by Rolando.Caloca
DR - vk - Move waits for cmd buffer
Change 3950270 by Guillaume.Abadie
Implements dedicated gather pass for foreground hole filling to avoid being VGPR bound in foreground gather pass, while still being able to amend foreground.
Change 3950272 by Rolando.Caloca
DR - vk - Minor refactor for semaphores
Change 3950279 by Guillaume.Abadie
Oops... fixes build
Change 3950298 by Rolando.Caloca
DR - vk - Gather wait semaphores in the cmd buffers
Change 3950371 by Rolando.Caloca
DR - vk - fixes for async compute
Change 3950597 by Rolando.Caloca
DR - vk - Fix for clip distance (fixes planar reflections)
Change 3951075 by Rolando.Caloca
DR - vk - Fix for async compute
Change 3952524 by Guillaume.Abadie
Some DOF enum refactoring.
Change 3955016 by Daniel.Wright
Fixed BuiltData package getting renamed into the map package during a content browser folder move, causing a redirector to be incorrectly placed in the map package
Change 3955668 by Guillaume.Abadie
Fixes a bug where full res coc buffer was computed even if not doing slight out of focus.
Change 3956722 by Guillaume.Abadie
Fixes a bug where r.DOF.MaximalForegroundBlurringRadius was screen percentage dependent.
Change 3959212 by Guillaume.Abadie
Prefixes all DOF's shaders files with DOF keyword.
Change 3959705 by Guillaume.Abadie
Optimises the DOF setup pass outputting half res and full res with LDS downsample.
Change 3959941 by Guillaume.Abadie
Halves DOF's hybrid scatter compilation by using a unique downsampling for both foreground and background, instead of 2 reduce passes.
Change 3962273 by Rolando.Caloca
DR - Fix typos
#jira UE-56317
PR #4586
Change 3962615 by Rolando.Caloca
DR - vk - Compile fix
Change 3962949 by Rolando.Caloca
DR - Fix DOFDownsample extension
Change 3962993 by Guillaume.Abadie
Back out changelist 3962949
Change 3963016 by Guillaume.Abadie
Adds missing DOFDownsample.usf
Change 3963041 by Rolando.Caloca
DR - vk - Misc changes to help integrate
Change 3964293 by Guillaume.Abadie
Fixes DOF's setup pass reading outside of the viewport.
Change 3964475 by Guillaume.Abadie
Collapses DOF's hybrid scatter compilation passes into reduce passes.
Change 3964883 by Daniel.Wright
Fixed 3d texture in uniform buffer on unsupporting RHI
Change 3964897 by Rolando.Caloca
DR - Compile fixes
Change 3964914 by Guillaume.Abadie
Fixes a bug on r.DOF.RecombineQuality=0
Change 3965153 by Guillaume.Abadie
Fixes compile warning in D3D12Commands.cpp.
Change 3965814 by Rolando.Caloca
DR - Prep for integration conflict resolve
Change 3965899 by Rolando.Caloca
DR - Fix odd linkage issue
Change 3966072 by Rolando.Caloca
DR - More prep for merge
Change 3966163 by Rolando.Caloca
DR - Merge prep
Change 3966844 by Guillaume.Abadie
Packs multiple DOF scattered bokeh per instance and uses PT_RectList in DOF for platforms that can.
Change 3967116 by Rolando.Caloca
DR - Compile fixes for integration
Change 3967273 by Rolando.Caloca
DR - Use same path for mip generation
Change 3967277 by Rolando.Caloca
DR - vk - Fix mips on cubemaps
Change 3967693 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, missing shaders
Change 3967851 by Rolando.Caloca
DR - Copying //UE4/Dev-Main@3912313 to //UE4-DevRendering, Engine 2/2
Change 3968083 by Rolando.Caloca
DR - Integration compile fixes
Change 3968240 by Rolando.Caloca
DR - Shader compile fixes for integration
Change 3968270 by Rolando.Caloca
DR - Fix for missing hash calculation
Change 3969426 by Rolando.Caloca
DR - vk - Fix warning
Change 3969869 by Krzysztof.Narkowicz
Back out changelist 3946295 - UE-54537 is fixed, so no need for this debug sentinel.
#jira none
Change 3969944 by Rolando.Caloca
DR - Warning fix
Change 3970020 by Rolando.Caloca
DR - Bump after integration
Change 3970052 by Rolando.Caloca
DR - Fix for mobile
Change 3970236 by Daniel.Wright
Causing decal shader to recompile to fix a merge bug
Change 3970270 by Daniel.Wright
Bump shader version from merge
Change 3970339 by Olaf.Piesche
Replace series of locks/unlocks with a single one for curve injection
#tests QAGame
Change 3970390 by Rolando.Caloca
DR - Rename FSceneTextureUniformParameters to FSceneTexturesUniformParameters
- Remove duplicate method for occlusion queries
Change 3970523 by Rolando.Caloca
DR - Fix serialization of shaders
Change 3970533 by Arne.Schober
DR - Fix for removing the SpeedTree wind when the scene gets deleted. The original enqueued render command requeues the element onto the render thread although the call already came from the render thread, and the scene can get lost in between.
#jira UE-56322
Change 3971160 by Guillaume.Abadie
Fixes CompositeEditorPrimtive pass and SelectionOutline pass for VR editor to work with TAAU.
Change 3971516 by Guillaume.Abadie
Cherry-pick 3912629: Fixes SSR that was computing vignetting according to PrevScreen, which could let some samples outside the viewport through when rotating the camera.
#jira UE-55353
Change 3971594 by Krzysztof.Narkowicz
Fixed assert inside BindLightMapVertexBuffer. FSplineMeshSceneProxy was calling BindLightMapVertexBuffer for invalid (still not generated) lightmap UV channel after mesh reimport. Simplified assert, as at the moment almost all of the high callsites already clamp lightmap uv channel.
#jira UE-56321
Change 3971622 by Krzysztof.Narkowicz
Fixed crash inside Indirect Lighting Cache. Data (reflection captures and lightmap) generation calls ULevel::GetOrCreateMapBuildData(), which can destroy lightmap data if level has legacy data. Last Lightmap generation step recreates this data, but if user cancels lightmap generation - it won't do that.
#jira UE-56171
Change 3974788 by Rolando.Caloca
DR - Remove GSupportsGenerateMips
Change 3974789 by Rolando.Caloca
DR - Remove bogus function
Change 3974986 by Rolando.Caloca
DR - vk - Tracking fixes
Change 3974989 by Rolando.Caloca
DR - vk - Don't submit dummy barriers
Change 3975075 by Olaf.Piesche
Update for particle curve injection improvement, fixing ES2 problems
#tests QAGame tm-shadermodels, various color curve tests in-editor
Change 3975957 by Uriel.Doyon
Fixed invalid max texture resolution when using the bake material tools.
Change 3978471 by Daniel.Wright
New cvar r.SkylightUpdateEveryFrame
Change 3978779 by Rolando.Caloca
DR - Accessor for texture sizes
Change 3978797 by Rolando.Caloca
DR - Clean up RHI CopyTexture API
Change 3978832 by Rolando.Caloca
DR - vk - Workaround for RenderDoc crashing due to Descriptor Pool reset
Change 3978836 by Rolando.Caloca
DR - vk - Remove generate mips
Change 3979201 by Rolando.Caloca
DR - vk - RHI CopyTexture. Uses general layout for generating mips
Change 3979204 by Rolando.Caloca
DR - Use render passes and CopyTexture to generate mips
Change 3979592 by Rolando.Caloca
DR - Warning fix
Change 3980855 by Krzysztof.Narkowicz
Optimize bounding sphere radius after non-uniform scale by using bounding box extent.
#jira UE-56227
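One way to read this optimization is sketched below, under the assumptions that the sphere and box share a center and that the scale is axis-aligned. The type and function names are hypothetical and this is not the engine's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type for the sketch.
struct FVec3 { float X, Y, Z; };

// Returns a bounding sphere radius after applying a per-axis scale.
// Scaling the radius by the largest axis scale is always safe but can be
// very loose for non-uniform scales; the half-diagonal of the scaled box is
// also a valid radius when box and sphere share a center, so the smaller of
// the two is used.
inline float ScaledSphereRadius(float SphereRadius, FVec3 BoxExtent, FVec3 Scale)
{
    const float AbsX = std::fabs(Scale.X);
    const float AbsY = std::fabs(Scale.Y);
    const float AbsZ = std::fabs(Scale.Z);

    const float ConservativeRadius = SphereRadius * std::max({AbsX, AbsY, AbsZ});

    const float EX = BoxExtent.X * AbsX;
    const float EY = BoxExtent.Y * AbsY;
    const float EZ = BoxExtent.Z * AbsZ;
    const float BoxHalfDiagonal = std::sqrt(EX * EX + EY * EY + EZ * EZ);

    return std::min(ConservativeRadius, BoxHalfDiagonal);
}
```

For a unit-extent box stretched 10x along X only, the conservative radius grows about tenfold while the box-based radius stays close to 10.1, so the sphere is noticeably tighter.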
Change 3981065 by Rolando.Caloca
DR - vk - Fix bad layout
#jira UE-56238
Change 3981346 by Rolando.Caloca
DR - Copy from 3707257
Support for not flushing compute jobs (r.D3D11.UAVFlushNV)
Change 3981347 by Rolando.Caloca
DR - Copy from 3707257
Don't flush between morph dispatches
Change 3981932 by Mark.Satterthwaite
Generate the shader hash and function name when a Metal shader error needs to be reported so that even without shader code we get something to go on.
Change 3982442 by Rolando.Caloca
DR - Fix warning
Change 3982652 by Rolando.Caloca
DR - vk - Signal semaphore cleanup
Change 3983917 by Richard.Wallis
Clone of CL 3974146 converted for mtlpp along with extra mtlpp usage suggestions by Mark Satt:
Fix for black flickering on first paint with weighted material landscape on Mac. When using AsyncCopyFromBufferToTexture in Metal we put the blit operation on the prologue encoder - however after a draw call using that resource the copy operation should happen after on the current encoder, this keeps the correct order of operations.
Added Bool return from various Async renderpass resource requests so the caller can decide the correct further action. Updated to include the other async functions.
Change 3984409 by Guillaume.Abadie
Attempts to make static analysis happy again.
Change 3984435 by Nick.Bullard
Checking in Performance Test level provided to us by Tor Frick based on UE-44841.
This has been utilized for checking issues against Aftermath performance impact.
The Map includes 2 Level Book marks, most testing has been done against Bookmark 1 view, in fullscreen, in game mode
Change 3985087 by Mark.Satterthwaite
Make sure that the particle scratch buffer is large enough to hold all the data for the curve texture we are rendering to, otherwise a full set of curves will start scribbling memory after 64Kb (the curve texture is 256Kb of data - 512x512x4 as sizeof(RGBAUInt8) == 4). This happens in ElementalDemo.
Change 3985201 by Rolando.Caloca
DR - Fix bad CopyTexture
Change 3985258 by Mark.Satterthwaite
Try and detect orientation changes so that we don't blow-up on iOS due to a huge mismatch between the drawable texture for the display and the scene's depth-stencil target. I can't just fiddle with the depth-stencil texture itself without running the risk of obliterating in-use data and really we shouldn't permit such a mismatch anyway but it is fallout from 3620990.
#jira UE-55756
Change 3986449 by Rolando.Caloca
DR - vk - Update & consolidate Vulkan headers to 1.1.70.1
Consolidate SDK into one
Change 3986571 by Guillaume.Abadie
Makes PVS-Studio happy again in DOF.
Change 3987039 by Yuriy.ODonnell
Initial implementation of tracing profiler to show CPU and multiple GPUs on the same timeline. Currently only supported on DX12 platforms.
Use `TracingProfiler frames=N` console command to trigger a capture of the next N frames. Trace is saved to disk as a JSON file into `Saved/Profiling/Traces` directory.
Trace file uses Google Tracing format and can be visualized in Chrome built-in profiler (chrome://tracing).
`r.GPUStatsChildTimesIncluded=1` CVar makes timing scopes hierarchical.
`TracingProfiler.BufferSize=N` CVar controls the size of the tracing buffer, which may need to be increased for long traces (default is 65k events). Can only be set at startup.
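For readers unfamiliar with the trace format mentioned above, here is a minimal sketch of writing "complete" events (`"ph":"X"`) in the JSON object form that chrome://tracing loads. The struct and function names are hypothetical, and which fields the engine actually writes is not stated in this description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical event record: one timed scope on one thread or GPU queue.
struct FTraceEvent
{
    std::string   Name;
    std::uint64_t StartMicroseconds;
    std::uint64_t DurationMicroseconds;
    int           ProcessId;
    int           ThreadId;
};

// Serializes events as {"traceEvents":[...]} with "X" (complete) events.
// Names are written verbatim, so this sketch assumes they need no escaping.
inline std::string BuildTraceJson(const std::vector<FTraceEvent>& Events)
{
    std::ostringstream Out;
    Out << "{\"traceEvents\":[";
    for (std::size_t Index = 0; Index < Events.size(); ++Index)
    {
        const FTraceEvent& Event = Events[Index];
        if (Index > 0)
        {
            Out << ",";
        }
        Out << "{\"name\":\"" << Event.Name << "\""
            << ",\"ph\":\"X\""
            << ",\"ts\":" << Event.StartMicroseconds
            << ",\"dur\":" << Event.DurationMicroseconds
            << ",\"pid\":" << Event.ProcessId
            << ",\"tid\":" << Event.ThreadId
            << "}";
    }
    Out << "]}";
    return Out.str();
}
```

Giving CPU threads and GPU queues distinct `tid` values is one way to place them on separate rows of the same timeline.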
Change 3987074 by Yuriy.ODonnell
Implemented timestamp calibration on DX11. Calibration is only performed when tracing profiler session starts.
Change 3987160 by Yuriy.ODonnell
Added thread naming and ordering to the tracing profiler output
Change 3987331 by Mark.Satterthwaite
Remove the Nvidia hack to retain resource references in command-buffers for UE-46604 as the mtlpp refactor provides stronger resource lifetime guarantees.
#jira UE-46604
Change 3987754 by Mark.Satterthwaite
Fix MetalRHI memory reporting in non-default path.
PR #4568
Change 3988184 by Arciel.Rekman
Linux: Fix editor OpenGL performance (UE-55960).
- GetCurrentThreadId() calls became much more frequent with the OpenGL RHIT refactor.
- We used to only cache that value in monolithic builds, because having per-thread static variables in dynamic libraries is risky due to OS limits.
- This change adds dynamically-managed per-thread cache for non-monolithic builds.
#jira UE-55960
Change 3988394 by Rolando.Caloca
DR - vk - Improve memory mgmt
- Use 256MB pages for Device heap (or 1/8th if less).
- Remove texture allocations not going through resource manager
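The page-size rule in this change can be read as "256MB, or one eighth of the heap when that is smaller". A sketch under that assumed reading, with a hypothetical function name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Hypothetical helper: picks the allocation page size for a device heap.
// Assumes "1/8th if less" means one eighth of the heap size when that is
// below the preferred 256 MiB page.
inline std::uint64_t DeviceHeapPageSize(std::uint64_t HeapSizeBytes)
{
    const std::uint64_t PreferredPageSize = 256 * MiB;
    const std::uint64_t OneEighthOfHeap = HeapSizeBytes / 8;
    return std::min(PreferredPageSize, OneEighthOfHeap);
}
```

Under this reading an 8 GiB heap uses 256 MiB pages, while a 1 GiB heap falls back to 128 MiB pages.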
Change 3988405 by Marcin.Undak
Fix VulkanQuery crash on exit #codereview rolando.caloca #codereview arciel.rekman #rb arciel.rekman
Change 3988567 by Rolando.Caloca
DR - vk - Support for packed global UBs on pci aperture heap
Change 3988668 by Rolando.Caloca
DR - vk - Remove old comments
Change 3988956 by Marcin.Undak
RecordPerformance: added option to skip building/cooking before tests #rb none #codereview arciel.rekman
Change 3989161 by Yuriy.ODonnell
Static analysis error fix
Change 3989196 by Guillaume.Abadie
Fixes a crash in light shaft's TAA pass.
#jira UE-57366
Change 3989207 by Yuriy.ODonnell
Refactored FRealtimeGPUProfilerFrame to avoid splitting profile events when calculating exclusive times of scopes. This allows tracing profiler to retain the hierarchical view of the data, while keeping CSV and GPU Stat system behavior intact.
Change 3989469 by Rolando.Caloca
DR - vk - Fix for bad index; fix for bad transition
Change 3989772 by Yuriy.ODonnell
Implemented timestamp calibration on Vulkan
Change 3990040 by Marcus.Wassmer
Aftermath enabled by default.
Removed unnecessary warning for other vendors
Change 3990064 by Mark.Satterthwaite
Ensure that packed globals are reuploaded when the command-encoder is restarted - don't simply invalidate the existing parameters. This properly handles cases where a single logical render-pass is broken into multiple command-encoders and/or command-buffers - otherwise all shaders must reset all parameters each time. When we move between frames we *do* want to perform a full state reset though as previous frame globals are treated as invalid.
Change 3990080 by Mark.Satterthwaite
Change the way we invalidate the visibility buffer between command-buffers and command-encoders so that on iOS you can reuse the same buffer within the same command-buffer, but not across more than one. The code provides an exception to this rule when running under the MetalRHI validation tools which can break each draw call into its own buffer.
Change 3990084 by Mark.Satterthwaite
Get MetalStatistics compiling again.
Change 3990381 by Arciel.Rekman
Bring back D3D12 in RecordPerformance.
Change 3991113 by Rolando.Caloca
DR - Fix crash on RHI thread on mobile preview
- Check RHI objects are not null in the PSO initializer
Change 3991191 by Ryan.Vance
#jira UE-55952
Reimplemented instanced stereo for forward lighting cull grid after the srv/ub clean up.
Change 3991343 by Rolando.Caloca
DR - Copy from 3911492
UE4 - Disabled parallel mobile base pass by default. This is experimental and not known to be useful on any mobile platform.
Change 3991375 by Mark.Satterthwaite
Proper copyright assignment in the mtlpp debugger header.
Change 3993151 by Daniel.Wright
Fix RTDF resource transition found by Rolando
Change 3993818 by Rolando.Caloca
DR - Missed file
Change 3993923 by Krzysztof.Narkowicz
Fixed crashes inside RemoveSpeedTreeWind() and RemoveSpeedTreeWind_RenderThread().
FStaticMeshComponentRecreateRenderStateContext didn't flush deferred render updates causing stale RenderData to be left:
1. Thumbnail manager called SetStaticMesh(nullptr), which added StaticMeshComponent to deferred render updates.
2. UStaticMesh::Build called FStaticMeshComponentRecreateRenderStateContext and destroyed RenderData, but didn't touch the Thumbnail manager's StaticMeshComponent as it was nullptr.
3. This resulted in a StaticMeshComponent with stale RenderData pointer.
#jira UE-54544
Change 3994033 by Rolando.Caloca
DR - vk - Reworked layers & extensions, as we were not doing it properly
- Remove -vulkanstandardvalidation and -novulkanstandardvalidation as they are not needed anymore
Change 3994275 by Mark.Satterthwaite
Change to linking against mtlpp via AddEngineThirdPartyPrivateStaticDependencies and marking its header with THIRD_PARTY_* macros in the vain hope that might convince the remote compilation code to distribute the module to the remote machine when building MetalRHI.
#jira UE-57507
Change 3994365 by Mark.Satterthwaite
Pilfer some code from the old MetalHeap file to handle calculating texture memory size on older macOS and iOS builds when running with stats or LLM enabled.
#jira UE-57513
Change 3994382 by Rolando.Caloca
DR - vk - Some missing locks during image tracking
Change 3994422 by Rolando.Caloca
DR - vk - Remove bogus shader format
Change 3995530 by Rolando.Caloca
DR - vk - Fix for crash when validation is enabled
Change 3995531 by Rolando.Caloca
DR - vk - Fix static analysis
Change 3995532 by Rolando.Caloca
DR - vk - Added support for r.Vulkan.SaveValidationCache
Change 3995610 by Uriel.Doyon
Texture Streaming Changes and Fixes:
- Using small FOV items (like scopes) now only affects visible primitives (through "r.Streaming.MaxHiddenPrimitiveViewBoost").
- Static components added after the level is registered in the streaming manager are now handled correctly (fixes the low quality on the chests)
- Dynamic components do not need to register to the streaming manager anymore.
- Optimized dynamic component management by removing duplicate entries in the update list.
- Added a pregarbage collect pass to the dynamic component management to optimize GC handling.
- Added a budget reset logic whenever the scene requirements change significantly.
- PIE worlds now have correct visibility information.
- Fixed possible invalid memory access when processing the streaming manager slave views.
- Refactored the incremental level texture data build to prevent new components from being unhandled.
- Removed StreamingManager callbacks for NotifyActorSpawned() and NotifyPrimitiveAttached()
- Added a StreamingManager callback NotifyPrimitiveUpdated(), to be used whenever a primitive streaming state must be updated.
#jira none
Change 3995908 by Arciel.Rekman
Fix compile errors when using new Vulkan queries.
Change 3995990 by Arciel.Rekman
More compile fixes to new Vulkan queries.
- MSVC did not catch this, clang did.
Change 3996101 by Rolando.Caloca
DR - vk - Win32 compile fix
Change 3996323 by Mark.Satterthwaite
Use the right include path to export the mtlpp headers.
#jira UE-57507
Change 3996392 by Arciel.Rekman
Vulkan: fix crash on start when using new queries.
- CommandBufferManager was not yet set at that point and the code in queries relied on it.
Change 3996585 by Rolando.Caloca
DR - Slight improvement to GL being black, but just a temporary 'workaround' as it's not correct.
Change 3998806 by Arciel.Rekman
Fix Linux build (UE-57602).
#jira UE-57602
Change 3998866 by Arciel.Rekman
SubwaySequencer: fix old shader platform name.
Change 3998947 by Mark.Satterthwaite
Silence deprecation warnings in CEF on macOS now that we've moved to 10.12 as the minimum.
#jira UE-57577
Change 3998951 by Mark.Satterthwaite
Fix last of the deprecation errors that I am aware of for macOS 10.12.
#jira UE-57581
Change 3998984 by Mark.Satterthwaite
Build mtlpp for iOS 9.0 not 9.3.
#jira UE-57586
Change 3999065 by Rolando.Caloca
DR - vk - Make sure we use version 1.0.0
#jira UE-57521
Change 3999071 by Arne.Schober
DR - [UE-55433, UE-57361] Hack SNORM support in OpenGL by re-interpreting UNORM. Underlying data is always SNORM.
#jira UE-55433, UE-57361
Change 3999494 by Rolando.Caloca
DR - Enable r.UnbindResourcesBetweenDrawsInDX11 in debug
- Clear compute resources when r.UnbindResourcesBetweenDrawsInDX11 is enabled
Change 4000197 by Krzysztof.Narkowicz
Mesh simplifier - normalize TexCoordWeights using min/max TexCoord range. This fixes precision issues for very big TexCoord values and allows optimizing for all TexCoord channels when channels have values of different magnitudes (e.g. non-standard TexCoord data).
#jira UE-54935
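The range normalization described in this change can be sketched as follows. This is a minimal illustration, not the simplifier's actual code: FTexCoordSample, NormalizeTexCoordWeights, the scalar-per-channel layout and the two-channel count are all hypothetical names and simplifications chosen for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: two channels, one scalar value per channel.
constexpr std::size_t NumTexCoordChannels = 2;

// Hypothetical sample type; the engine's vertex layout is different.
struct FTexCoordSample
{
    std::array<float, NumTexCoordChannels> U{};
};

// Hypothetical helper: divides a base weight by each channel's (max - min)
// range, so channels with very large values no longer dominate the error.
inline std::array<float, NumTexCoordChannels> NormalizeTexCoordWeights(
    const std::vector<FTexCoordSample>& Samples, float BaseWeight)
{
    assert(BaseWeight > 0.0f);

    std::array<float, NumTexCoordChannels> Weights{};
    Weights.fill(BaseWeight);
    if (Samples.empty())
    {
        return Weights;
    }

    for (std::size_t Channel = 0; Channel < NumTexCoordChannels; ++Channel)
    {
        float Min = Samples[0].U[Channel];
        float Max = Min;
        for (const FTexCoordSample& Sample : Samples)
        {
            Min = std::min(Min, Sample.U[Channel]);
            Max = std::max(Max, Sample.U[Channel]);
        }

        // Leave the base weight untouched for a degenerate (constant) channel.
        const float Range = Max - Min;
        if (Range > 1e-8f)
        {
            Weights[Channel] = BaseWeight / Range;
        }
    }
    return Weights;
}
```

With one channel spanning [0, 1] and another spanning [0, 1000], the second channel's weight ends up roughly a thousand times smaller, so both contribute at a comparable scale.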
Change 4000305 by Yuriy.ODonnell
Suppress PVS Studio warning V547 (Expression is always true) related to Aftermath
Reported issue to PVS team and to NVIDIA. Confirmed false positive, fix coming in future PVS version (v6.24).
#jira UE-57579
Change 4000853 by Arciel.Rekman
Linux: fix not calling CrashReportClient (UE-57678).
#jira UE-57678
Change 4001504 by Rolando.Caloca
DR - vk - Fix transition
Change 4002460 by Krzysztof.Narkowicz
Toggle for contact shadow length in world space
Exposed contact shadows to Blueprints
#jira none
Change 4002608 by Rolando.Caloca
DR - vk - Fix static analysis
- Fix potential debug image tracking crash
- Comment out unused methods
Change 4002615 by Rolando.Caloca
DR - vk - Allow r.Vulkan.WaitForIdleOnSubmit to be set at startup (e.g. in ConsoleVariables.ini)
Previously, if your map needed to UpdateSkyCaptureContents on startup, an ensure would fail if GWaitForIdleOnSubmit was set.
PrepareForCPURead needs to wait for the command buffer to finish before trying to read the results back, but the wait has already happened when r.Vulkan.WaitForIdleOnSubmit is set. Trying to wait again correctly complains that the command buffer is not in the correct state. So, skip the WaitForCmdBuffer call when r.Vulkan.WaitForIdleOnSubmit is set.
Change 4002640 by Rolando.Caloca
DR - vk - Missing support for CVarDefaultBackBufferPixelFormat
Change 4002919 by Guillaume.Abadie
Implements DOF's temporal upsampling pass for better dynamic resolution stability.
Change 4002984 by Guillaume.Abadie
Integrates Sebastian Aaltonen's ALU optimisations for TAAU.
Change 4003112 by Olaf.Piesche
Fix for TBB stall (resulting in severe hitches and hangs in the editor with stats active); tested multiple scenarios and encountered no hitches.
#tests QAGame PerformanceTest and RenderTest map with various stats on and off
Change 4003159 by Mark.Satterthwaite
Undo parts of changelist 3970553 - the ref-counted pointer approach to returning textures to the pool is not working as expected so we'll remove that. It'll be faster on the CPU without it and everything works thanks to the changes this CL made to the way textures were released.
#jira UE-57538
Change 4003287 by zachary.wilson
Adding reflection capture content to TM-LightingScenarios
Change 4003395 by Arne.Schober
DR - Fix uninitialised value when clicking Go To in the editor
#jira UE-57048
Change 4003425 by Rolando.Caloca
DR - vk - Fix for new occlusion queries
Change 4003530 by Arne.Schober
DR - Disable GPU Benchmark in headless configurations
#jira UE-57673
Change 4003717 by Rolando.Caloca
DR - vk - Fix for depth not store, stencil store
Change 4003719 by Rolando.Caloca
DR - Minor switch to render pass
Change 4003720 by Mark.Satterthwaite
Don't suballocate private memory buffers on Vega and only Vega as there is something wrong with the blits in those cases but I can't capture a GPU trace to find out what right now (the driver is broken) - could be a bug in my code but this works on Polaris and Nvidia so it will need to be filed as a radar for AMD.
Remove the FMetalBufferChunk from FMetalBuffer and simply store a pointer to the owning Heap/Magazine allocator. The FMetalResourceHeap now calls a new Release function to return the buffer to the allocator which will be faster on the CPU.
#jira UE-57659
Change 4003854 by Mark.Satterthwaite
Undo parts of 3990064 and try a different approach to get the uniforms to upload and remain available in the right places. As the original bug has been lost to time we should keep an eye out for missing buffer bindings by running under the Metal validation layer periodically.
#jira UE-57576
Change 4004709 by Rolando.Caloca
DR - Support for D3D 11, 12 & Vulkan for UAVs off Index Buffers
Change 4005149 by Guillaume.Abadie
Adds shader permutation to avoid clamping input buffer UV in DOF's gather pass.
Change 4005284 by Uriel.Doyon
Resaved volume texture assets with proper engine version.
#jira UE-57534
Change 4005286 by Guillaume.Abadie
Reduces constant setup in DOF's gather pass.
Change 4005359 by Rolando.Caloca
DR - vk - Fix annoying warning
Change 4005363 by Rolando.Caloca
DR - Fix android not finding vulkan shaders
Change 4005457 by Rolando.Caloca
DR - vk - Fix swapchain crash
Change 4005473 by Patrick.Kelly
UE-57135: Editor crash if set Reflection Capture Resolution to be 64 and New a Default level
Code by Daniel
Tested by Patrick
Change 4005474 by Rolando.Caloca
DR - vk - Remove glsl code from shaders. Packaged QAGame goes from 176MB to 162MB
Change 4005759 by Krzysztof.Narkowicz
Fixed a bug, where reflection capture build is called, even though we are in mobile preview mode.
#jira UE-57743
Change 4005774 by Mark.Satterthwaite
Update the wave intrinsics to avoid implicit bool->uint conversion that Apple don't like.
#jira UE-57750
Change 4005974 by Mark.Satterthwaite
Don't use cubemap array types on iOS Metal as they aren't available on all devices and we need to maintain backward compatibility for years to come.
#jira UE-57083
Change 4006056 by Mark.Satterthwaite
Remove the use of the PrimitiveType argument from Metal draw calls.
#jira UE-57822
Change 4006139 by Mark.Satterthwaite
- Move the render-pass functions into the MetalRHI implementation for later alteration.
- Implement Index buffer UAVs for Metal - makes them more like vertex-buffers so this is one more step on the road to a unified buffer base-class implementation.
Change 4006215 by Mark.Satterthwaite
Metal's begin & end render/compute pass API implementation will take some time, but for now make it not depend on the parent stub implementation.
Change 4006394 by Mark.Satterthwaite
In lieu of a real instruction count just use the number of lines in the "Main" function of the shader as the instruction count for Metal.
#jira UE-57551
Change 4006493 by Mark.Satterthwaite
MetalRHI can currently support 4-component formats for Buffer UAVs - this might need some thought in the future as the API evolves but we might as well take advantage while we can.
Change 4006495 by Daniel.Wright
Integrate from Refactor branch
* New FMaterialRenderProxy function GetMaterialWithFallback which provides both the FMaterialRenderProxy and FMaterial. Needed when falling back to default material, so that proxy and material resource match.
* Local vertex factory uniform buffer
Change 4006851 by Brian.Karis
Fix for joined charts forming an L to inflate both axes.
Thanks to Jess Kube of The Coalition.
Change 4006852 by Brian.Karis
Fix for hard coded reflection capture cube map size. Should fix static light aliasing in captures
Change 4006918 by Brian.Karis
New ByteBuffer functionality. Memcpy and scatter upload. Can implement GPU side TArray reflection.
Not yet used by checked in code. WIP optimization.
Change 4007246 by Guillaume.Abadie
Creates lower quality permutation for DOF's gathering pass, without CoC-based weighting of the samples, and a lower number of gathering rings for the fast accumulator.
Change 4007291 by Guillaume.Abadie
Exposes more DOF scalability settings.
Change 4007328 by Guillaume.Abadie
Optimises DOF's half res only setup pass using gather4
Change 4007627 by Richard.Wallis
Fix for when Magic Mouse cannot zoom in World Composition editor. Missing default SNodePanel::OnMouseMove behaviour. Tested using a classic 2xbutton + wheel mouse and a Mac MagicMouse.
#jira UE-57030
Change 4007682 by Richard.Wallis
No video when playing HLS streaming video on Mac. 2 Issues: FPS was zero, making the duration for the video sample buffer nonsense, and Video Track dimensions were going to zero on the AVAsset once fully initialized when playing HLS streams. Now cache relevant details and handle zero frame rate.
Notes:
- Caching the frame rate is not as important as we could look it up each time and fix for zero - ignoring that at the moment.
- Assume we DO NOT want the FrameSize to be the last fetched video frame size from the AvfMediaVideoSampler as I think that is the video quality for streaming video and not the media frame size.
- Renamed a variable in the AvfMediaVideoSample - was called FrameRate but it was the FrameDuration by that point.
#jira UE-56734
Change 4007731 by Rolando.Caloca
DR - Disable byte buffers on non-hlsl based platforms
#jira UE-57851
Change 4007741 by Rolando.Caloca
DR - Disable byte buffers on hlslcc platforms
Change 4007782 by Mark.Satterthwaite
Force Metal shaders, including the stdlib, to recompile.
Change 4007918 by Rolando.Caloca
DR - vk - Some static asserts
Change 4008404 by Arciel.Rekman
Do not crash on incompatible Vulkan drivers (UE-57521).
#jira UE-57521
Change 4008442 by Daniel.Wright
Better comments on ERHIFeatureLevel expectations
Change 4008494 by Arne.Schober
DR - moved bDeletedThroughDeferredCleanup before begincleanup to catch cases where the reference is added twice to the array. Also removed finishcleanup as all they ever did was delete the pointer anyway, and it should be added if such functionality is ever required from outside of the regular destructor.
#jira UE-57754
Change 4008730 by Mark.Satterthwaite
After the most recent changes to handling uniform buffer dirty bits in MetalRHI we should guard against attempts to set an unbound uniform buffer.
#jira UE-57870
Change 4008949 by Brian.Karis
Fix compile warning
Change 4008951 by Brian.Karis
Added LTC LUT textures
Change 4009326 by Guillaume.Abadie
Compiles out DOF's gathering bokeh simulation on platforms other than desktop.
Change 4009380 by Krzysztof.Narkowicz
Moved area light code before the contact shadows, so contact shadows use representative light's direction.
Merged all contact shadows shader code.
Contact shadows keep constant screen space length independent of FoV settings.
Contact shadows for translucents.
Contact shadows for eye.
Change 4009555 by Guillaume.Abadie
Splits DOFCocTile.usf in two.
Change 4009999 by Yuriy.ODonnell
MallocStomp can now be enabled on certain platforms using '-stompmalloc' command line argument.
Previously it was necessary to modify MallocStomp.h and re-compile the engine.
Currently supported platforms: Win64, Mac, Linux.
Replaced hard-coded page size with FPlatformMemory::GetConstants().PageSize.
Change 4010288 by Rolando.Caloca
DR - vk - Fix for vertex streams
Change 4010289 by Krzysztof.Narkowicz
D3D12 - fixed depth bounds bug, where depth bounds wasn't properly set to [0;1] after disabling.
#jira UE-57510
Change 4010297 by Rolando.Caloca
DR - vk - Remove some functions for android
Change 4010315 by Rolando.Caloca
DR - vk - Remove create info macro
Change 4010451 by Rolando.Caloca
DR - vk - Reuse samplers
- Infiltrator goes from 5759 to 24 samplers!
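Sampler reuse of this kind usually amounts to a cache keyed on the sampler description. The sketch below only illustrates that idea and is not the VulkanRHI code: FSamplerDesc, FSamplerHandle and FSamplerCache are hypothetical names, and the integer handle stands in for a real sampler object that would be created through the graphics API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical, heavily simplified sampler description.
struct FSamplerDesc
{
    int Filter = 0;
    int AddressU = 0;
    int AddressV = 0;
    int AddressW = 0;
    float MipBias = 0.0f;

    // Strict weak ordering so the description can be used as a map key.
    bool operator<(const FSamplerDesc& Other) const
    {
        return std::tie(Filter, AddressU, AddressV, AddressW, MipBias)
             < std::tie(Other.Filter, Other.AddressU, Other.AddressV,
                        Other.AddressW, Other.MipBias);
    }
};

// Stand-in for an API sampler object.
using FSamplerHandle = std::uint32_t;

class FSamplerCache
{
public:
    // Returns the existing handle for an identical description, or creates
    // a new one on the first request.
    FSamplerHandle GetOrCreate(const FSamplerDesc& Desc)
    {
        const auto It = Cache.find(Desc);
        if (It != Cache.end())
        {
            return It->second;
        }

        // A real RHI would create the API sampler object here.
        const FSamplerHandle Handle = NextHandle++;
        Cache.emplace(Desc, Handle);
        return Handle;
    }

    std::size_t NumUniqueSamplers() const { return Cache.size(); }

private:
    std::map<FSamplerDesc, FSamplerHandle> Cache;
    FSamplerHandle NextHandle = 1;
};
```

Because most materials share a handful of filter and address-mode combinations, the number of distinct entries stays small even when thousands of sampler states are requested.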
Change 4010627 by Rolando.Caloca
DR - vk - Fix missing values for tracking swapchain validation
Change 4011924 by Guillaume.Abadie
Implements tile based early return optimisation on DOF's postfiltering method.
Change 4011941 by Guillaume.Abadie
Shaves some ALU in DOF's accumulator for LowQuality permutation.
Change 4012093 by Yuriy.ODonnell
Disable MallocStompOverrunTest() in static analysis config, as it intentionally performs an out-of-bounds access.
Change 4012195 by Rolando.Caloca
DR - vk - Fix for mobile backbuffer layout
Change 4012202 by Rolando.Caloca
DR - vk - Don't use staging buffers on UMA
Change 4012467 by Rolando.Caloca
DR - Remove redundant check
Change 4012486 by Rolando.Caloca
DR - Fix missing transition
Change 4012518 by Guillaume.Abadie
Implements fast shader permutation for DOF's TAA pass.
Change 4013084 by Arciel.Rekman
Fix for Linux clock discrepancy.
- Causing at least one precision issue, possibly more.
(Edigrating 4003273, 4012462 from //UE4/Dev-Editor/... to //UE4/Dev-Rendering/...)
Change 4013266 by Uriel.Doyon
Fixed crash when setting SceneDepthTextureNonMS and not having valid depth buffers in the SceneContext.
Change 4013626 by Uriel.Doyon
Fixed crash in the lighting build when creating a blueprint of the ALight and placing a light component in it.
#jira UE-51672
Change 4013805 by Rolando.Caloca
DR - Fix more missing transitions
Change 4014128 by Arne.Schober
DR - Do not create LocalVFUniformBuffer when running without MVF
#jira UE-57929
Change 4014193 by Uriel.Doyon
Editing component transforms now invalidate the component's lighting cache.
#jira UE-48134
Change 4014282 by Rolando.Caloca
DR - vk - Remove extra validation during dump
Change 4014584 by Uriel.Doyon
Duplicated static meshes now generate a new GUID to prevent possible issues with lightmass.
#jira UE-49064
Change 4014604 by Uriel.Doyon
UStaticMesh postduplicate now only generates a new GUID if !bDuplicateForPIE.
Change 4015460 by Guillaume.Abadie
Composes separate translucency within DOF's recombine pass.
Change 4015571 by Guillaume.Abadie
Refactors tonemapper to use global shader permutation API, which adds a permutation for HDR output device rather than dynamic branching that some shader compilers do not optimize well.
Change 4015984 by Krzysztof.Narkowicz
Fixed crash inside DFAO resource allocation, when DFAO viewport has zero area.
#jira UE-58000
Change 4016056 by Mark.Satterthwaite
Fix Mac Metal shader compilation of texture cube arrays.
Change 4016062 by Richard.Wallis
Convert things like Space, Delete, F6 etc to unicode so they display correctly on the Mac menu rather than first letter of word. Added the default Mac commands to the GenericCommands so we get a Chord overwrite message and stop things like cmd+ q / w / h from getting bound.
#jira UE-46999
Change 4016109 by Mark.Satterthwaite
One unified Metal buffer implementation - will make further changes a heck of a lot easier.
Change 4016221 by Patrick.Kelly
UE-57617: Ensure changing viewmode to ShaderComplexity while in -game
Change 4016238 by Guillaume.Abadie
Makes clang happy again in Tonemapper.
Change 4016309 by Mark.Satterthwaite
More *_RenderThread implementations for MetalRHI.
Change 4016414 by Mark.Satterthwaite
And MetalRHI version of CreateStructuredBuffer_RenderThread...
Change 4016498 by Mark.Satterthwaite
Don't hold on to the uniform buffers bound to the hull shader when switching to a tessellated draw call as they'll have the wrong buffer layout.
#jira UE-57930
Change 4017394 by Juan.Canada
OpenGL: Fixed shading artifacts due to incorrect UNORM/SNORM conversions in skin/skincache/computetangent shaders.
#jira UE-57691
Change 4017522 by Rolando.Caloca
DR - vk - Remove unused code path (old mip generation detection)
Change 4017539 by Rolando.Caloca
DR - vk - Fix for sky lighting mips showing green on AMD
Change 4017542 by Arciel.Rekman
Moved appCountTrailingZeros to a non-SSE header (fixes ARM64 build).
- Arguably WITH_SLI shouldn't apply to Linux on ARM but the fact that the function wasn't available is bad on its own.
Change 4017827 by Guillaume.Abadie
Optimises DOF's scattering cost by a third.
Change 4017835 by Rolando.Caloca
DR - Only allow a render pass to generate mips for one color render target
Change 4017889 by Mark.Satterthwaite
Cache all the Metal state objects to avoid hitting the API unnecessarily.
Change 4018251 by Mark.Satterthwaite
Fix broken rendering on Metal that tracked back to the innocuous looking changes in CL #4006495 (no blame attached - these changes are entirely reasonable) and cause various bugs in QAGame's TM-DistanceFields, ElementalDemo and probably more. Doesn't fix broken SpeedTree rendering :(.
MetalRHI was allowing uniform buffers to blow away linear texture buffers when the constant buffer has been elided due to dead-code elimination. This problem can manifest without linear textures if the uniform buffer contains both constant data and a resource-table but the shader doesn't use any of the constant data. That's because Metal doesn't separate constant buffers from any other kind of buffer unlike D3D which separates all the slots out - and Metal doesn't provide enough buffers to emulate the D3D arrangement. So far this has only manifested in the MVF + Linear Texture case but a more robust solution will be necessary long term.
Change 4018514 by Guillaume.Abadie
Implements r.DOF.Scatter.MinCocRadius.
Change 4018553 by Guillaume.Abadie
Implements r.DOF.Scatter.MaxSpriteRatio to control the budget upperbound of DOF's scattering
Change 4020369 by Yuriy.ODonnell
Disable MallocStompOverrunTest in all static analysis configs (using USING_CODE_ANALYSIS macro)
Previously was only disabled for PVS-Studio.
Change 4020620 by Arciel.Rekman
Fix XboxOne CIS (fallout of appCountTrailingZeros move).
Change 4020949 by Guillaume.Abadie
Configures DOF in scalability settings.
Change 4021593 by Rolando.Caloca
DR - vk - Support for Aftermath style api on AMD
Change 4021740 by Rolando.Caloca
DR - vk - Change log output
Change 4022008 by Uriel.Doyon
Fixed renderthread stalls when streaming texture mips on low end systems.
Change 4022135 by Rolando.Caloca
DR - vk - Fix last mip's layout during mip chain creation
Change 4022607 by Jian.Ru
Speculative fix for a bug where an invalid vertex buffer is dereferenced
#jira UE-56229
Change 4022890 by Rolando.Caloca
DR - Fix reference count not getting released
Change 4023540 by Mark.Satterthwaite
Avoid some pointless retain/release calls on Metal Encoders.
Change 4023796 by Marcus.Wassmer
Tell users they are over the maximum size when allocating very large rendertargets.
Change 4025337 by Yuriy.ODonnell
Improved use-after-free detection mechanism and physical memory usage of MallocStomp on Windows.
MallocStomp on Windows will now reserve virtual address space for every allocation and then commit physical pages only to the valid usable part.
Physical pages will be unmapped on Free, but virtual address space will not be released and therefore will never be re-used.
Virtual address space is allocated from the OS in blocks of 1GB and then linearly sub-allocated.
This reduces VA space usage, as VirtualAlloc returns blocks on 64KB granularity even if we just need 4KB. As a small bonus, this also reduces number of syscalls per allocation.
This dramatically increases accuracy of use-after-free detection, but consumes significant amount of memory for the OS page table.
Virtual memory limit for a process on Win10 is 128 TB, which means we can afford to keep virtual memory reserved for a long time.
Running Infiltrator demo consumes ~700MB of virtual address space per second.
Additionally, committing physical pages only for the usable part of the entire virtual block reduces physical memory usage by ~30% compared to old behavior,
which allocated and committed the entire block of pages via BinnedAllocFromOS and then marked the border page as non-accessible.
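The bookkeeping behind this scheme can be modelled without touching the OS. The sketch below is an assumption-laden illustration rather than the MallocStomp implementation: FLinearVAModel and FStompAllocation are hypothetical names, the 4KB page size is assumed, no VirtualAlloc call is made, and placing the allocation flush against a trailing guard page is a simplification that ignores alignment.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t PageSize = 4096;                // assumed page size
constexpr std::uint64_t ReserveBlockSize = 1ull << 30;  // 1GB, per the description

// Hypothetical record of one allocation's address-space layout.
struct FStompAllocation
{
    std::uint64_t ReservedOffset;  // start of the reserved range
    std::uint64_t ReservedBytes;   // usable pages plus one guard page
    std::uint64_t CommittedBytes;  // usable pages only
    std::uint64_t UserOffset;      // where the returned pointer would sit
};

// Models linear sub-allocation from 1GB reservations. Address space is only
// ever handed out forward and never reused, mirroring the described behavior
// where Free unmaps physical pages but keeps the range reserved.
class FLinearVAModel
{
public:
    FStompAllocation Allocate(std::uint64_t Size)
    {
        assert(Size > 0);
        const std::uint64_t UsablePages = (Size + PageSize - 1) / PageSize;
        const std::uint64_t Committed = UsablePages * PageSize;
        const std::uint64_t Reserved = Committed + PageSize;
        assert(Reserved <= ReserveBlockSize);

        // Reserve a fresh 1GB block when the current one cannot fit the request.
        if (BlockUsed + Reserved > ReserveBlockSize)
        {
            ++NumBlocksReserved;
            BlockUsed = 0;
        }

        FStompAllocation Result;
        Result.ReservedOffset = (NumBlocksReserved - 1) * ReserveBlockSize + BlockUsed;
        Result.ReservedBytes = Reserved;
        Result.CommittedBytes = Committed;
        // End of the user data abuts the uncommitted guard page.
        Result.UserOffset = Result.ReservedOffset + Committed - Size;

        BlockUsed += Reserved;
        return Result;
    }

    std::uint64_t GetNumBlocksReserved() const { return NumBlocksReserved; }

private:
    std::uint64_t NumBlocksReserved = 0;
    std::uint64_t BlockUsed = ReserveBlockSize;  // forces a reservation on first use
};
```

In this model a 100-byte request reserves two pages of address space but commits only one, which is where the physical memory saving described above comes from.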
Change 4026047 by Rolando.Caloca
DR - Fix test/shipping
#jira UE-58148
Change 4026150 by Krzysztof.Narkowicz
Force proper ordering of buffer visualization materials - after tonemapping (so exposure doesn't influence it) and before editor stuff like icons.
#jira UE-57992
Change 4026226 by Rolando.Caloca
DR - Fix static analysis
#jira UE-58150
Change 4026354 by Jian.Ru
Debug check trying to catch a crash. Only enabled in editor build
#jira UE-50111
Change 4026655 by Rolando.Caloca
DR - Fix for static analysis
#jira UE-58149
Change 4026763 by Rolando.Caloca
DR - Remove references to defunct CCT to avoid confusing licensees
Change 4027167 by Uriel.Doyon
Fixed possible out of bound buffer access when serializing with FDuplicateDataWriter.
#jira UE-56509
Change 4027850 by Jian.Ru
Prevent log spam
#jira UE-50111
Change 4029546 by Rolando.Caloca
DR - Compile fixes
Change 4029624 by Yuriy.ODonnell
Addressed static analysis errors in MallocStomp
- VirtualAlloc return value is now explicitly checked.
- C6250 is suppressed, as VirtualFree does not release address space by design.
Change 4030225 by Yuriy.ODonnell
Static analysis warning fix: make sure declaration of Sleep() is consistent between Windows headers and TBB
The complexity with this particular case is that the warning is generated in synchapi.h, which is included by some Windows headers.
If a module includes TBB and then Windows platform headers, static analyzer will report this warning.
Suppressing it would require wrapping all instances of Windows header includes in third-party macros.
Current pragmatic solution is to modify the Sleep() declaration in TBB header to be consistent with Windows and to report the issue to Intel for a permanent fix.
Change 4030440 by Rolando.Caloca
DR - Fix crash on mobile
#jira UE-58222
Change 4030570 by Daniel.Wright
Allow null SRV's in uniform buffers for feature levels that don't support SRV's in shaders
Change 4030618 by Arne.Schober
DR - missing tangent/normal sign conversion after integration from main
#jira UE-58224
Change 4031588 by Rolando.Caloca
DR - vk - Fix compile error when missing vkCmdWriteBufferMarkerAMD
Change 4032145 by Mark.Satterthwaite
Fix UE-58268 by only emitting the base_instance/base_vertex variables required to fix-up the instance/vertex ID values to match D3D when the Metal version is 1.1 or higher, earlier versions don't support these features.
#jira UE-58268
Change 4032209 by Rolando.Caloca
DR - Fix crash on EngineTest: Mesh Batch's UserIndex is not a union anymore
Change 4033178 by Guillaume.Abadie
Fixes FXAA sampling outside viewports, which was causing a black outline on the bottom and right edges of the screen when ViewSize != BufferSize, problematic for some automated screenshot tests.
#jira UE-58151
Change 4034489 by Daniel.Wright
Fixed UStaticMeshComponent modifying its UStaticMesh when undoing a change. This caused a crash when other static mesh components using the same mesh asset were rendered, since their rendering state was not recreated. A component should not modify its asset during PostEditUndo.
* This behavior has been present for a long time but was previously hidden because only the vertex factory of the mesh asset is cached in static draw lists, not any of its rendering resources (eg vertex declaration).
Change 4035157 by Uriel.Doyon
Fixed deadlock in the streaming code when running with -onethread.
#jira UE-58299
Change 4035198 by Rolando.Caloca
DR - vk - Fix issue when an older SDK was installed, UBT would pick it (should pick the newer of ThirdParty\Vulkan or installed SDK).
#jira UE-58267
Change 4035730 by Arne.Schober
DR - Fix missing Fog parameters during LightScattering Injection
#jira UE-57608
Change 4035843 by Daniel.Wright
Reimplemented support for EyeAdaptation node in opaque materials
Change 4036837 by Marcus.Wassmer
Replace some of the screenshots to match new un-tonemapped buffer visualization
Change 4036980 by Rolando.Caloca
DR - vk - Fix deadlock contention during mem allocation on Linux
Change 4037225 by Guillaume.Abadie
Fixes jittering selection outline.
#jira UE-58350
Change 4038056 by Marcus.Wassmer
roll back changelist 4026150. breaks a bunch of automated tests by cutting off half the image.
Change can go back in later with that part fixed also
Change 4038296 by Jian.Ru
Static analysis fix
#jira UE-58377
Change 4038402 by Ben.Marsh
Suppress IncludeTool warnings caused by CL 3998947.
Change 4038514 by Arne.Schober
DR - Fix case with MVF where instance offset is not supported by the API (in this case only foliage OpenGL and TvOS). Usually the buffers are offset instead, but with MVF we do not use offset buffers, therefore the offset needs to be passed into the shader although we are drawing with an offset of 0.
#jira UE-57652
Change 4038747 by Marcus.Wassmer
Back out changelist 3853645, causing us to lose shadows in the shaderhair test
Change 4040138 by Rolando.Caloca
DR - Fix compile warning
Change 4041614 by Rolando.Caloca
DR - vk - Fix for Oculus module
#jira UE-58267
Change 3810277 by Daniel.Wright
Ray Traced Distance Field shadows use a two pass tile culling algorithm with no tile max - fixes flickering from tile overflow in dense areas or with a low sun angle. Costs .2ms on PS4.
The distance field scene buffers now use float4 on PS4 and Xbox, saves .1ms on PS4.
Change 3817029 by Uriel.Doyon
Added UVolumeTexture, which use 3D textures. Compressed formats are supported on DX11, DX12, PS4 and XB1.
Projects targeting OpenGL don't have access to compressed formats (as the implementation has texture tiling issues).
Add "r.AllowVolumeTextureAssetCreation" set as 0 by default, which controls whether volume texture can be sampled in materials and whether they can be created from 2D texture assets.
Platforms not supporting BC7 will now fall back on RGBA8 instead of DXT to preserve quality, in an attempt to increase usage of BC7.
#jira UE-32263
Change 3819960 by Michael.Lentine
Expose UEPhysics Clothing Parameters through UI.
Change 3823401 by Rolando.Caloca
DR - Add NumQueriesInBatch to RHIBeginOcclusionQueryBatch
Change 3844805 by Arne.Schober
DR - Increased Intermediate normal of Umodel and Skelmesh from 8bit Unorm Compressed to float. A resave/rebuild/reimport of the meshes is recommended to recover some lost precision.
Fixed an issue with compressed (packed) normals on the GPU which were off by one integer representation. Also switched from UNORM to SNORM to get a discrete zero representation and removed some mads from all the VertexShaders.
Change 3847283 by Marcus.Wassmer
Extra fixes from Uriel
Change 3876607 by Rolando.Caloca
DR - Use render passes when running occlusion queries
- Removes the RHI(Begin|End)OcclusionQueryBatch API
Change 3903799 by Daniel.Wright
[Integrate] Pass Uniform Buffers
* All pass-constant shader inputs should go into the appropriate pass uniform buffer, instead of being set per-draw
* Moved many per-draw base pass parameters over to the Base Pass Uniform Buffer
* Opaque and Translucent base pass shaders have different uniform buffers, which allows compile errors when accessing an invalid resource (eg GBuffer in Opaque), instead of silently falling back to GBlackTexture
Uniform buffers can now contain nested structs with UNIFORM_MEMBER_STRUCT()
* This allows composing a uniform buffer at a particular update frequency out of many features, with encapsulation of each feature's parameters in a struct.
* Eg deferred fog uses FFogUniformParameters, but so does translucency in the base pass, where FFogUniformParameters is reused nested inside the base pass uniform buffer.
* Resources can now be located anywhere in the uniform buffer. Padding is inserted to the cbuffer representation to keep memory layouts matching. In the future the cbuffer could be compacted.
* RemoveUniformBuffersFromSource() which works around HLSLCC lack of struct initializers now handles nested structs
Change 3917500 by Rolando.Caloca
DR - Change depth bounds so only the enable bit is in the PSO, allow min/max to be dynamically modified
Change 3964907 by Guillaume.Abadie
Implements RectList topology support in RHI.
Change 3979171 by Mark.Satterthwaite
Copying //Tasks/UE4/Dev-UERNDR-354-mtlpp to Dev-Rendering (//UE4/Dev-Rendering):
Rewrites MetalRHI in terms of mtlpp, which is a C++ wrapper library built around Metal's Objective-C API that attempts to reduce overheads and eliminate resource lifetime errors.
Regarding mtlpp:
- The mtlpp library uses C++ constructor/destructor and smart-pointer style management of Objective-C retain/release calls to prevent over- and under-release problems.
- To reduce Objective-C overheads the mtlpp library caches the internal C-function that implements the Objective-C selectors for the most commonly used Metal protocol types and calls the function directly - this avoids objc_msgSend which does this look-up dynamically and thus improves CPU performance slightly.
- Another advantage is that mtlpp provides infrastructure to extend the Metal API slightly to help improve MetalRHI - the two important aspects are mtlpp::CommandBufferFence which provides a consistent CPU<->GPU synchronisation primitive and sub-buffer allocations from mtlpp::Buffer which allow for far superior memory management.
- Validation functionality is also provided by mtlpp to detect CPU vs. GPU data races and resource lifetime validation - this is expensive and is thus optional and compiled out from Shipping binaries that should be used when performance is most critical. The validation only works between resource modification and *submitted* command-buffers - anything that is being actively encoded on the CPU is ignored and it remains the responsibility of the application to validate the order of operations when encoding.
Apple Platform:
- LLM support which tracks Objective-C objects is enabled only on macOS - we don't have the necessary libraries to intercept and override the internal system calls on iOS.
MetalRHI:
- All the types are switched over, (mostly) insulating the external API from the horror of Metal and Objective-C.
- Buffers are now managed quite differently: small buffers are allocated from a magazine allocator that allocates in fixed blocks from a larger parent buffer, intermediate-sized buffers are allocated from a simple heap allocator that wraps a larger buffer, and anything of reasonable size (>2Mb) uses the pooled allocator. This *radically* reduces the number of buffer resources, by as much as a factor of 10, because they are now sub-allocated without the need to use MTLHeap or MTLFence, so they are performance-equivalent to the existing implementation on the GPU and much faster on the CPU. Total memory use is approximately the same.
- Vertex & index buffer management has been updated to reflect changes in the management and to avoid reallocating buffers which provide a Linear Texture (for SRVs) unless strictly necessary. This ensures that even in cases where a dynamic buffer is updated multiple times in a frame it will still work acceptably well.
- The Metal ring-buffer implementation is completely different again, this time it can use Managed memory on macOS which allows for much better performance on eGPUs which will be more and more important for Mac.
- Everything that needs to wait on a command-buffer fence (rather than a command-buffer itself) now uses mtlpp::CommandBufferFence, which prevents race conditions between the different command-buffer handlers (which sometimes execute out of order).
- LLM tracking should now report the same data as the MetalRHI stats group for buffer & texture allocations - there is no segmentation for Vertex/index/Structured/Uniform allocations in Metal so these numbers are going to be wrong and will need to be rethought.
- Less visible are a number of small but important resource usage fixes that prevent stale resources from remaining bound to the device after the point at which they become invalid. These were necessary to satisfy the new mtlpp validation code and should eliminate a class of errors where the GPU uses a resource pointer that is modified by the CPU.
Other:
- Remove the Metal focused workarounds from the ClothBuffer resource binding and related vertex-buffer SRV - these were put in when MetalRHI/MetalShaderFormat couldn't handle float->uint conversions correctly and they should now.
- Fix a validation error caused by trying to render a 0-sized scissor rect which is invalid in Metal and simply pointless elsewhere.
- Consistency of disabling the Manual Vertex Fetch behaviour in shaders.
#jira UERNDR-354
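The three-tier buffer allocation described under MetalRHI above can be read as a size-based dispatch. The following is a minimal sketch under stated assumptions, not MetalRHI's actual code: the enum, the function name and the caller-supplied small-buffer limit are hypothetical, and only the ">2Mb" pooled threshold comes from the change note (read here as 2 MiB).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for the three allocation paths named in the change note.
enum class EBufferAllocator { Magazine, Heap, Pooled };

// Assumption: "2Mb" is interpreted as 2 MiB.
constexpr std::size_t PooledThresholdBytes = 2u * 1024u * 1024u;

// The note gives no figure for the small/intermediate boundary, so it is a
// parameter here rather than a constant.
inline EBufferAllocator ChooseBufferAllocator(std::size_t SizeBytes, std::size_t SmallLimitBytes)
{
    assert(SmallLimitBytes <= PooledThresholdBytes);
    if (SizeBytes > PooledThresholdBytes)
    {
        return EBufferAllocator::Pooled;   // large buffers: pooled allocator
    }
    if (SizeBytes <= SmallLimitBytes)
    {
        return EBufferAllocator::Magazine; // small buffers: fixed blocks from a parent buffer
    }
    return EBufferAllocator::Heap;         // intermediate buffers: simple heap over a larger buffer
}
```

With a hypothetical small limit of 4 KiB, a 256-byte request would take the magazine path and a 64 KiB request the heap path.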
Change 3979312 by Rolando.Caloca
DR - Remove bogus bKeepOriginalSurface parameter in CopyToResolveTarget
Change 4005122 by Rolando.Caloca
DR - Support for PS4 Index Buffer UAVs
Change 4016298 by Guillaume.Abadie
Fixes DOF hybrid scattering on platforms that support RectList topology.
Change 4018575 by Guillaume.Abadie
Optimises DOF's reduce pass when doing scattering compilation.
Change 4020317 by Guillaume.Abadie
Implements WaveBroadcastIntrinsics.ush.
[CL 4042226 by Marcus Wassmer in Main branch]
2018-05-01 10:36:33 -04:00
ViewUniformShaderParameters.GlobalVolumeDimension = GlobalDistanceFieldInfo.ParameterData.GlobalDFResolution;
ViewUniformShaderParameters.GlobalVolumeTexelSize = 1.0f / GlobalDistanceFieldInfo.ParameterData.GlobalDFResolution;
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters.MaxGlobalDFAOConeDistance = GlobalDistanceFieldInfo.ParameterData.MaxDFAOConeDistance;
2020-07-06 18:58:26 -04:00
ViewUniformShaderParameters.NumGlobalSDFClipmaps = GlobalDistanceFieldInfo.ParameterData.NumGlobalSDFClipmaps;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
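The three r.sss.checkerboard values described above can be sketched as a small mode-resolution function. This is an illustrative sketch only: the enum, the function name and the boolean parameter (standing in for "scene color is a 64-bit or wider render target with separate alpha") are hypothetical and not taken from the engine source.

```cpp
#include <cassert>

// Hypothetical result type for the sketch.
enum class ESSSRenderMode { FullRes, Checkerboard };

// CVarValue follows the change note: 0 = off/full res, 1 = checkerboard,
// 2 = automatic based on the scene color format.
inline ESSSRenderMode ResolveCheckerboardMode(int CVarValue, bool bSceneColorSupportsFullRes)
{
    assert(CVarValue >= 0 && CVarValue <= 2);
    if (CVarValue == 0)
    {
        return ESSSRenderMode::FullRes;
    }
    if (CVarValue == 1)
    {
        return ESSSRenderMode::Checkerboard;
    }
    // Automatic: full resolution is only chosen when the render target can hold it.
    return bSceneColorSupportsFullRes ? ESSSRenderMode::FullRes : ESSSRenderMode::Checkerboard;
}
```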
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Atmospheric Light Color using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
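The last change in the list above (clearing the cached view uniform parameters when a snapshot is created by memcpy) guards against a bitwise copy aliasing a pointer it does not own. A minimal sketch of that hazard and its fix, assuming hypothetical types: FViewSketch and FCachedParamsSketch are illustrative stand-ins, and a raw pointer stands in for whatever ownership type the engine actually uses.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the cached parameter struct.
struct FCachedParamsSketch
{
    float TexelSize;
};

// Hypothetical stand-in for a view that caches its parameters.
struct FViewSketch
{
    int ViewIndex;
    FCachedParamsSketch* CachedParams; // owned by the original view only
};

static_assert(std::is_trivially_copyable<FViewSketch>::value,
              "memcpy is only well-defined for trivially copyable types");

inline FViewSketch MakeSnapshot(const FViewSketch& Source)
{
    FViewSketch Snapshot;
    std::memcpy(&Snapshot, &Source, sizeof(FViewSketch));
    // The bitwise copy duplicated the pointer; clear it so the snapshot never
    // aliases (or later frees) the original view's cache.
    Snapshot.CachedParams = nullptr;
    return Snapshot;
}
```

The snapshot keeps plain data such as ViewIndex, while the source view keeps sole ownership of its cache.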
2016-08-17 11:38:13 -04:00
2020-09-08 17:44:06 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldPageAtlasTexture = OrBlack3DIfNull(GlobalDistanceFieldInfo.ParameterData.PageAtlasTexture);
2022-03-01 21:07:45 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldCoverageAtlasTexture = OrBlack3DIfNull(GlobalDistanceFieldInfo.ParameterData.CoverageAtlasTexture);
2022-01-26 08:27:36 -05:00
ViewUniformShaderParameters.GlobalDistanceFieldPageTableTexture = OrBlack3DUintIfNull(GlobalDistanceFieldInfo.ParameterData.PageTableTexture);
2020-09-15 11:03:59 -04:00
ViewUniformShaderParameters.GlobalDistanceFieldMipTexture = OrBlack3DIfNull(GlobalDistanceFieldInfo.ParameterData.MipTexture);
2022-03-01 21:07:45 -05:00
ViewUniformShaderParameters.FullyCoveredExpandSurfaceScale = GLumenSceneGlobalSDFFullyCoveredExpandSurfaceScale;
ViewUniformShaderParameters.UncoveredExpandSurfaceScale = GLumenSceneGlobalSDFUncoveredExpandSurfaceScale;
ViewUniformShaderParameters.UncoveredMinStepScale = GLumenSceneGlobalSDFUncoveredMinStepScale;
2016-08-17 11:38:13 -04:00
}
2020-02-06 17:56:50 -05:00
void ReadbackDistanceFieldClipmap(FRHICommandListImmediate& RHICmdList, FGlobalDistanceFieldInfo& GlobalDistanceFieldInfo)
{
FGlobalDistanceFieldReadback* Readback = GDFReadbackRequest;
GDFReadbackRequest = nullptr;
2020-09-08 17:44:06 -04:00
//FGlobalDistanceFieldClipmap& ClipMap = GlobalDistanceFieldInfo.Clipmaps[0];
2022-04-06 18:24:24 -04:00
//FTextureRHIRef SourceTexture = ClipMap.RenderTarget->GetRHI();
2020-09-08 17:44:06 -04:00
//FIntVector Size = SourceTexture->GetSizeXYZ();
2020-02-06 17:56:50 -05:00
2020-09-08 17:44:06 -04:00
//RHICmdList.Read3DSurfaceFloatData(SourceTexture, FIntRect(0, 0, Size.X, Size.Y), FIntPoint(0, Size.Z), Readback->ReadbackData);
//Readback->Bounds = ClipMap.Bounds;
//Readback->Size = Size;
2022-04-18 14:30:53 -04:00
ensureMsgf(false, TEXT("#todo: Global DF readback requires a rewrite as global distance field is no longer stored in a continuous memory"));
2020-09-08 17:44:06 -04:00
Readback->Bounds = FBox(FVector(0.0f), FVector(0.0f));
Readback->Size = FIntVector(0);
2020-02-06 17:56:50 -05:00
// Fire the callback to notify that the request is complete
DECLARE_CYCLE_STAT(TEXT("FSimpleDelegateGraphTask.DistanceFieldReadbackDelegate"), STAT_FSimpleDelegateGraphTask_DistanceFieldReadbackDelegate, STATGROUP_TaskGraphTasks);
FSimpleDelegateGraphTask::CreateAndDispatchWhenReady(
Readback->ReadbackComplete,
GET_STATID(STAT_FSimpleDelegateGraphTask_DistanceFieldReadbackDelegate),
nullptr,
Readback->CallbackThread
);
}
2020-07-06 18:58:26 -04:00
class FCullObjectsToClipmapCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER(FCullObjectsToClipmapCS);
SHADER_USE_PARAMETER_STRUCT(FCullObjectsToClipmapCS, FGlobalShader);
BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV(RWStructuredBuffer<uint>, RWObjectIndexBuffer)
SHADER_PARAMETER_RDG_BUFFER_UAV(RWStructuredBuffer<uint>, RWObjectIndexNumBuffer)
Sparse, narrow band, streamed Mesh Signed Distance Fields
* SDFs are now generated, allocated from the atlas and uploaded in 8^3 bricks (7^3 unique data, half voxel padding).
* Tracing must load the brick index from the indirection table, and only bricks near the surface are stored
* 3 mips are now generated, with the lowest resolution always loaded and the other 2 streamed
* SDFs are now G8 narrow band. Lower resolution mips must be traversed when querying distance to nearest surface far away from the surface
* The Distance Field Brick Atlas is now stored for each FScene and dynamically resized based on needs with a GPU memcopy
* Brick atlas uses a 1d pooled allocator which has no fragmentation and greatly reduces packing waste over the 3d allocator
* Added new indirection for Distance Field Asset data, so that only a single entry needs to be updated when a mip is streamed in or out in scenes with millions of instances
* Compute shaders operating on distance field instances generate streaming requests, which are async read back to CPU, turned into IO requests, which are polled and when complete uploaded to atlases
* Any mesh instance inside the Global SDF extent (200m) requests mip1, and at 50m requests mip2
* Now using a batched compute scatter to upload to the distance field atlas instead of RHIUpdateTexture3d, to bypass alignment restrictions and per-upload overhead
* Distance Field streaming uses an async task to move Memcpy and IO request overhead off of the Rendering Thread
* Distance Field Visualization now computes a normal from the SDF gradient and does simple lighting to better visualize the scene representation
* Increased r.DistanceFields.MaxPerMeshResolution from 128 to 512, to better represent large objects
* Mesh SDF generation now uses an Embree point query to calculate closest unsigned distance, and then a much smaller set of rays to count backfaces for negative region determination, for an 11x speedup
* Upgraded mesh utilities to Embree 3.12.2 to get point queries
* Fixed wrong transform used for SDF normals in Lumen, causing non-uniformly scaled meshes to have incorrect Surface Cache interpolation
* Fixed Static Mesh materials not getting PostLoaded before SDF build, causing their blend modes to be wrong for the build, which corrupts the DDC. Also included those blend modes in the DDC key.
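The brick layout and distance-based mip requests in the list above can be sketched with a little arithmetic. This is a rough sketch under stated assumptions, not the engine's implementation: the function names are hypothetical, the mip numbering follows the note's wording (0 stands for the always-resident lowest-resolution mip), the thresholds are treated as inclusive, and the brick count assumes a mip's resolution is covered by bricks that each contribute 7 unique voxels per axis.

```cpp
#include <cassert>

// From the change note: 8^3 bricks holding 7^3 unique data (half voxel padding).
constexpr int BrickSize = 8;
constexpr int UniqueDataBrickSize = 7;

// From the change note: Global SDF extent of 200m, mip2 requested at 50m.
constexpr float GlobalSDFExtentMeters = 200.0f;
constexpr float Mip2DistanceMeters = 50.0f;

// Hypothetical helper: which mip an instance at the given distance requests.
inline int RequestedMeshSDFMip(float DistanceMeters)
{
    assert(DistanceMeters >= 0.0f);
    if (DistanceMeters <= Mip2DistanceMeters)
    {
        return 2;
    }
    if (DistanceMeters <= GlobalSDFExtentMeters)
    {
        return 1;
    }
    return 0; // only the always-loaded mip
}

// Hypothetical helper: bricks needed along one axis, rounding up.
inline int NumBricksAlongAxis(int ResolutionVoxels)
{
    assert(ResolutionVoxels > 0);
    return (ResolutionVoxels + UniqueDataBrickSize - 1) / UniqueDataBrickSize;
}
```

Under these assumptions an instance 120m away would request mip1, and a 14-voxel axis would need two bricks.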
Original costs on 1080 GTX (full updates on everything and no screen traces)
10.60ms UpdateGlobalDistanceField
3.62ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
1.73ms VoxelizeCards Clipmaps=[0,1,2,3]
0.38ms TraceCards 1 dispatch 1 groups
0.51ms TraceCards 1 dispatch 1 groups
Sparse SDF costs
12.06ms UpdateGlobalDistanceField
4.35ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
2.30ms VoxelizeCards Clipmaps=[0,1,2,3]
0.69ms TraceCards 1 dispatch 1 groups
0.77ms TraceCards 1 dispatch 1 groups
Tested: TopazEntry PC, Reverb PC and PS5, EngineTests, QAGame, Rift, Frosty P_Construct_WP, FortGPUTestbed
#rb Krzysztof.Narkowicz
#ROBOMERGE-OWNER: Daniel.Wright
#ROBOMERGE-AUTHOR: daniel.wright
#ROBOMERGE-SOURCE: CL 15784493 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v783-15756269)
#ROBOMERGE-CONFLICT from-shelf
[CL 15790658 by Daniel Wright in ue5-main branch]
2021-03-23 22:40:05 -04:00
SHADER_PARAMETER_STRUCT_INCLUDE(FDistanceFieldObjectBufferParameters, DistanceFieldObjectBuffers)
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER(FVector3f, ClipmapWorldCenter)
SHADER_PARAMETER(FVector3f, ClipmapWorldExtent)
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER(uint32, AcceptOftenMovingObjectsOnly)
SHADER_PARAMETER(float, MeshSDFRadiusThreshold)
SHADER_PARAMETER(float, InfluenceRadiusSq)
END_SHADER_PARAMETER_STRUCT()
static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters& Parameters)
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders(Parameters.Platform);
2020-07-06 18:58:26 -04:00
}
static int32 GetGroupSize()
{
return 64;
}
static void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment)
{
FGlobalShader::ModifyCompilationEnvironment(Parameters, OutEnvironment);
OutEnvironment.SetDefine(TEXT("CULLOBJECTS_THREADGROUP_SIZE"), GetGroupSize());
}
};
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER(FCullObjectsToClipmapCS, "/Engine/Private/DistanceField/GlobalDistanceField.usf", "CullObjectsToClipmapCS", SF_Compute);
2020-07-06 18:58:26 -04:00
class FClearIndirectArgBufferCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER(FClearIndirectArgBufferCS);
SHADER_USE_PARAMETER_STRUCT(FClearIndirectArgBufferCS, FGlobalShader);
BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWBuffer < uint > , RWPageUpdateIndirectArgBuffer )
2022-04-22 19:55:41 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWBuffer < uint > , RWCullGridUpdateIndirectArgBuffer )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWBuffer < uint > , RWPageComposeIndirectArgBuffer )
2020-07-06 18:58:26 -04:00
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-07-06 18:58:26 -04:00
}
static int32 GetGroupSize ( )
{
return 1 ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE " ) , GetGroupSize ( ) ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FClearIndirectArgBufferCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " ClearIndirectArgBufferCS " , SF_Compute ) ;
2020-07-06 18:58:26 -04:00
class FBuildGridTilesCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FBuildGridTilesCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FBuildGridTilesCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_STRUCT_REF ( FViewUniformShaderParameters , View )
2022-04-22 19:55:41 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageIndirectArgBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWCullGridTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWCullGridIndirectArgBuffer )
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER_RDG_BUFFER_SRV ( Buffer < float4 > , UpdateBoundsBuffer )
SHADER_PARAMETER ( uint32 , NumUpdateBounds )
SHADER_PARAMETER ( float , InfluenceRadiusSq )
2022-04-22 19:55:41 -04:00
SHADER_PARAMETER ( FIntVector , PageGridResolution )
SHADER_PARAMETER ( FVector3f , PageGridCoordToWorldCenterScale )
SHADER_PARAMETER ( FVector3f , PageGridCoordToWorldCenterBias )
SHADER_PARAMETER ( FVector3f , PageGridTileWorldExtent )
SHADER_PARAMETER ( FIntVector , CullGridResolution )
SHADER_PARAMETER ( FVector3f , CullGridCoordToWorldCenterScale )
SHADER_PARAMETER ( FVector3f , CullGridCoordToWorldCenterBias )
SHADER_PARAMETER ( FVector3f , CullGridTileWorldExtent )
2020-07-06 18:58:26 -04:00
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-07-06 18:58:26 -04:00
}
static int32 GetGroupSize ( )
{
2022-04-22 19:55:41 -04:00
static_assert ( GlobalDistanceField : : CullGridFactor = = 4 , " Shader is hard coded for CullGridFactor=4 " ) ;
return 4 ;
2020-07-06 18:58:26 -04:00
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE " ) , GetGroupSize ( ) ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FBuildGridTilesCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " BuildGridTilesCS " , SF_Compute ) ;
2020-07-06 18:58:26 -04:00
class FCullObjectsToGridCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FCullObjectsToGridCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FCullObjectsToGridCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWCullGridAllocator )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWCullGridObjectHeader )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWCullGridObjectArray )
2021-03-17 06:01:59 -04:00
RDG_BUFFER_ACCESS ( CullGridIndirectArgBuffer , ERHIAccess : : IndirectArgs )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , CullGridTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , ObjectIndexBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , ObjectIndexNumBuffer )
Sparse, narrow band, streamed Mesh Signed Distance Fields
* SDFs are now generated, allocated from the atlas and uploaded in 8^3 bricks (7^3 unique data, half voxel padding).
* Tracing must load the brick index from the indirection table, and only bricks near the surface are stored
* 3 mips are now generated, with the lowest resolution always loaded and the other 2 streamed
* SDFs are now G8 narrow band. Lower resolution mips must be traversed when querying distance to nearest surface far away from the surface
* The Distance Field Brick Atlas is now stored for each FScene and dynamically resized based on needs with a GPU memcopy
* Brick atlas uses a 1d pooled allocator which has no fragmentation and greatly reduces packing waste over the 3d allocator
* Added new indirection for Distance Field Asset data, so that only a single entry needs to be updated when a mip is streamed in or out in scenes with millions of instances
* Compute shaders operating on distance field instances generate streaming requests, which are read back asynchronously to the CPU and turned into IO requests; these are polled and, when complete, uploaded to the atlases
* Any mesh instance inside the Global SDF extent (200m) requests mip1, and at 50m requests mip2
* Now using a batched compute scatter to upload to the distance field atlas instead of RHIUpdateTexture3d, to bypass alignment restrictions and per-upload overhead
* Distance Field streaming uses an async task to move Memcpy and IO request overhead off of the Rendering Thread
* Distance Field Visualization now computes a normal from the SDF gradient and does simple lighting to better visualize the scene representation
* Increased r.DistanceFields.MaxPerMeshResolution from 128 to 512, to better represent large objects
* Mesh SDF generation now uses an Embree point query to calculate closest unsigned distance, and then a much smaller set of rays to count backfaces for negative region determination, for an 11x speedup
* Upgraded mesh utilities to Embree 3.12.2 to get point queries
* Fixed wrong transform used for SDF normals in Lumen, causing non-uniformly scaled meshes to have incorrect Surface Cache interpolation
* Fixed Static Mesh materials not getting PostLoaded before SDF build, causing their blend modes to be wrong for the build, which corrupts the DDC. Also included those blend modes in the DDC key.
Original costs on 1080 GTX (full updates on everything and no screen traces)
10.60ms UpdateGlobalDistanceField
3.62ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
1.73ms VoxelizeCards Clipmaps=[0,1,2,3]
0.38ms TraceCards 1 dispatch 1 groups
0.51ms TraceCards 1 dispatch 1 groups
Sparse SDF costs
12.06ms UpdateGlobalDistanceField
4.35ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
2.30ms VoxelizeCards Clipmaps=[0,1,2,3]
0.69ms TraceCards 1 dispatch 1 groups
0.77ms TraceCards 1 dispatch 1 groups
Tested: TopazEntry PC, Reverb PC and PS5, EngineTests, QAGame, Rift, Frosty P_Construct_WP, FortGPUTestbed
#rb Krzysztof.Narkowicz
#ROBOMERGE-OWNER: Daniel.Wright
#ROBOMERGE-AUTHOR: daniel.wright
#ROBOMERGE-SOURCE: CL 15784493 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v783-15756269)
#ROBOMERGE-CONFLICT from-shelf
[CL 15790658 by Daniel Wright in ue5-main branch]
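The brick layout and 1d pooled allocator described in the change notes above can be illustrated with a small sketch. It is not the engine's implementation: every name ending in `Sketch` is hypothetical, the atlas dimensions are made-up parameters, and the assumption that each brick covers exactly 7 unique voxels per axis is read off the "8^3 bricks (7^3 unique data)" note rather than confirmed against the engine source.

```cpp
#include <cassert>

// Assumed from the change notes: 8 voxels stored per brick axis, 7 of them unique.
constexpr int BrickSizeSketch = 8;
constexpr int UniqueBrickSizeSketch = 7;

struct FBrickCoordSketch
{
    int X;
    int Y;
    int Z;
};

// Bricks needed along one axis to cover MipResolution voxels,
// assuming every brick contributes 7 unique voxels on that axis.
int BricksPerAxisSketch(int MipResolution)
{
    assert(MipResolution > 0);
    return (MipResolution + UniqueBrickSizeSketch - 1) / UniqueBrickSizeSketch;
}

// Converts a 1D pooled brick index into the voxel offset of that brick inside
// a 3D atlas that holds AtlasBricksX * AtlasBricksY bricks per Z slice.
FBrickCoordSketch BrickIndexToAtlasVoxelOffsetSketch(int BrickIndex, int AtlasBricksX, int AtlasBricksY)
{
    assert(BrickIndex >= 0 && AtlasBricksX > 0 && AtlasBricksY > 0);
    const int BricksPerSlice = AtlasBricksX * AtlasBricksY;
    return {
        (BrickIndex % AtlasBricksX) * BrickSizeSketch,
        ((BrickIndex / AtlasBricksX) % AtlasBricksY) * BrickSizeSketch,
        (BrickIndex / BricksPerSlice) * BrickSizeSketch
    };
}
```

Because the allocator hands out plain integers, freeing and reusing a brick never fragments the atlas; only the index-to-coordinate mapping needs to know the 3D layout.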
2021-03-23 22:40:05 -04:00
SHADER_PARAMETER_STRUCT_INCLUDE ( FDistanceFieldObjectBufferParameters , DistanceFieldObjectBuffers )
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER ( FIntVector , CullGridResolution )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , CullGridCoordToWorldCenterScale )
SHADER_PARAMETER ( FVector3f , CullGridCoordToWorldCenterBias )
SHADER_PARAMETER ( FVector3f , CullTileWorldExtent )
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER ( float , InfluenceRadiusSq )
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-07-06 18:58:26 -04:00
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FCullObjectsToGridCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " CullObjectsToGridCS " , SF_Compute ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
class FComposeObjectsIntoPagesCS : public FGlobalShader
2020-07-06 18:58:26 -04:00
{
2020-09-08 17:44:06 -04:00
DECLARE_GLOBAL_SHADER ( FComposeObjectsIntoPagesCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FComposeObjectsIntoPagesCS , FGlobalShader ) ;
2020-07-06 18:58:26 -04:00
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
SHADER_PARAMETER_STRUCT_REF ( FViewUniformShaderParameters , View )
2022-03-01 21:07:45 -05:00
SHADER_PARAMETER_RDG_TEXTURE_UAV ( RWTexture3D < UNORM float > , RWPageAtlasTexture )
SHADER_PARAMETER_RDG_TEXTURE_UAV ( RWTexture3D < UNORM float > , RWCoverageAtlasTexture )
2021-03-17 06:01:59 -04:00
RDG_BUFFER_ACCESS ( ComposeIndirectArgBuffer , ERHIAccess : : IndirectArgs )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , ComposeTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , HeightfieldMarkedPageBuffer )
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < uint > , PageTableLayerTexture )
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < uint > , ParentPageTableLayerTexture )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , CullGridObjectHeader )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , CullGridObjectArray )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , ObjectIndexNumBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , ObjectIndexBuffer )
2021-03-23 22:40:05 -04:00
SHADER_PARAMETER_STRUCT_INCLUDE ( FDistanceFieldObjectBufferParameters , DistanceFieldObjectBuffers )
SHADER_PARAMETER_STRUCT_INCLUDE ( FDistanceFieldAtlasParameters , DistanceFieldAtlas )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER ( float , ClipmapVoxelExtent )
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER ( float , InfluenceRadius )
SHADER_PARAMETER ( float , InfluenceRadiusSq )
SHADER_PARAMETER ( FIntVector , CullGridResolution )
SHADER_PARAMETER ( FIntVector , GlobalDistanceFieldScrollOffset )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , GlobalDistanceFieldInvPageAtlasSize )
SHADER_PARAMETER ( FVector3f , InvPageGridResolution )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER ( FIntVector , PageGridResolution )
2020-07-06 18:58:26 -04:00
SHADER_PARAMETER ( FIntVector , ClipmapResolution )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , PageCoordToVoxelCenterScale )
SHADER_PARAMETER ( FVector3f , PageCoordToVoxelCenterBias )
SHADER_PARAMETER ( FVector3f , PageCoordToPageWorldCenterScale )
SHADER_PARAMETER ( FVector3f , PageCoordToPageWorldCenterBias )
2021-09-22 10:01:48 -04:00
SHADER_PARAMETER ( FVector4f , ClipmapVolumeWorldToUVAddAndMul )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , ComposeTileWorldExtent )
SHADER_PARAMETER ( FVector3f , ClipmapMinBounds )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER ( uint32 , PageTableClipmapOffsetZ )
2020-07-06 18:58:26 -04:00
END_SHADER_PARAMETER_STRUCT ( )
class FComposeParentDistanceField : SHADER_PERMUTATION_BOOL ( " COMPOSE_PARENT_DISTANCE_FIELD " ) ;
2022-02-02 05:42:31 -05:00
class FProcessDistanceFields : SHADER_PERMUTATION_BOOL ( " PROCESS_DISTANCE_FIELDS " ) ;
2022-03-01 21:07:45 -05:00
class FCompositeCoverageAtlas : SHADER_PERMUTATION_BOOL ( " COMPOSITE_COVERAGE_ATLAS " ) ;
2022-04-25 07:32:32 -04:00
class FOffsetDataStructure : SHADER_PERMUTATION_INT ( " OFFSET_DATA_STRUCT " , 3 ) ;
using FPermutationDomain = TShaderPermutationDomain < FComposeParentDistanceField , FProcessDistanceFields , FCompositeCoverageAtlas , FOffsetDataStructure > ;
2020-07-06 18:58:26 -04:00
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-07-06 18:58:26 -04:00
}
static FIntVector GetGroupSize ( )
{
return FIntVector ( 4 , 4 , 4 ) ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE " ) , GetGroupSize ( ) . X ) ;
OutEnvironment . SetDefine ( TEXT ( " COMPOSITE_THREADGROUP_SIZEX " ) , GetGroupSize ( ) . X ) ;
OutEnvironment . SetDefine ( TEXT ( " COMPOSITE_THREADGROUP_SIZEY " ) , GetGroupSize ( ) . Y ) ;
OutEnvironment . SetDefine ( TEXT ( " COMPOSITE_THREADGROUP_SIZEZ " ) , GetGroupSize ( ) . Z ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FComposeObjectsIntoPagesCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " ComposeObjectsIntoPagesCS " , SF_Compute ) ;
2020-09-08 17:44:06 -04:00
class FInitPageFreeListCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FInitPageFreeListCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FInitPageFreeListCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageFreeListBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < int > , RWPageFreeListAllocatorBuffer )
SHADER_PARAMETER ( uint32 , GlobalDistanceFieldMaxPageNum )
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-09-08 17:44:06 -04:00
}
static uint32 GetGroupSize ( )
{
return 64 ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE " ) , GetGroupSize ( ) ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FInitPageFreeListCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " InitPageFreeListCS " , SF_Compute ) ;
2020-09-08 17:44:06 -04:00
class FAllocatePagesCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FAllocatePagesCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FAllocatePagesCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
SHADER_PARAMETER_STRUCT_REF ( FViewUniformShaderParameters , View )
2021-03-17 06:01:59 -04:00
RDG_BUFFER_ACCESS ( PageUpdateIndirectArgBuffer , ERHIAccess : : IndirectArgs )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , PageUpdateTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , MarkedHeightfieldPageBuffer )
SHADER_PARAMETER_RDG_TEXTURE_UAV ( RWTexture3D < uint > , RWPageTableCombinedTexture )
SHADER_PARAMETER_RDG_TEXTURE_UAV ( RWTexture3D < uint > , RWPageTableLayerTexture )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < int > , RWPageFreeListAllocatorBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , PageFreeListBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageFreeListReturnAllocatorBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageFreeListReturnBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageComposeTileBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWBuffer < uint > , RWPageComposeIndirectArgBuffer )
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < uint > , ParentPageTableLayerTexture )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , InvPageGridResolution )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER ( FIntVector , PageGridResolution )
SHADER_PARAMETER ( uint32 , GlobalDistanceFieldMaxPageNum )
SHADER_PARAMETER ( uint32 , PageTableClipmapOffsetZ )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , PageWorldExtent )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER ( float , PageWorldRadius )
SHADER_PARAMETER ( float , ClipmapInfluenceRadius )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , PageCoordToPageWorldCenterScale )
SHADER_PARAMETER ( FVector3f , PageCoordToPageWorldCenterBias )
2021-09-22 10:01:48 -04:00
SHADER_PARAMETER ( FVector4f , ClipmapVolumeWorldToUVAddAndMul )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , CullGridObjectHeader )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , CullGridObjectArray )
SHADER_PARAMETER ( FIntVector , CullGridResolution )
2021-03-23 22:40:05 -04:00
SHADER_PARAMETER_STRUCT_INCLUDE ( FDistanceFieldObjectBufferParameters , DistanceFieldObjectBuffers )
SHADER_PARAMETER_STRUCT_INCLUDE ( FDistanceFieldAtlasParameters , DistanceFieldAtlas )
2020-09-08 17:44:06 -04:00
END_SHADER_PARAMETER_STRUCT ( )
2022-01-31 04:59:02 -05:00
class FProcessDistanceFields : SHADER_PERMUTATION_BOOL ( " PROCESS_DISTANCE_FIELDS " ) ;
2020-09-08 17:44:06 -04:00
class FMarkedHeightfieldPageBuffer : SHADER_PERMUTATION_BOOL ( " MARKED_HEIGHTFIELD_PAGE_BUFFER " ) ;
class FComposeParentDistanceField : SHADER_PERMUTATION_BOOL ( " COMPOSE_PARENT_DISTANCE_FIELD " ) ;
2022-04-25 07:32:32 -04:00
class FOffsetDataStructure : SHADER_PERMUTATION_INT ( " OFFSET_DATA_STRUCT " , 3 ) ;
using FPermutationDomain = TShaderPermutationDomain < FProcessDistanceFields , FMarkedHeightfieldPageBuffer , FComposeParentDistanceField , FOffsetDataStructure > ;
2020-09-08 17:44:06 -04:00
static FPermutationDomain RemapPermutation ( FPermutationDomain PermutationVector )
{
if ( PermutationVector . Get < FComposeParentDistanceField > ( ) )
{
PermutationVector . Set < FMarkedHeightfieldPageBuffer > ( false ) ;
}
return PermutationVector ;
}
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
FPermutationDomain PermutationVector ( Parameters . PermutationId ) ;
if ( RemapPermutation ( PermutationVector ) ! = PermutationVector )
{
return false ;
}
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-09-08 17:44:06 -04:00
}
static FIntVector GetGroupSize ( )
{
return FIntVector ( 64 , 1 , 1 ) ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_X " ) , GetGroupSize ( ) . X ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Y " ) , GetGroupSize ( ) . Y ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Z " ) , GetGroupSize ( ) . Z ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FAllocatePagesCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " AllocatePagesCS " , SF_Compute ) ;
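The RemapPermutation / ShouldCompilePermutation pair in FAllocatePagesCS above skips any permutation that is not its own canonical form. A minimal standalone sketch of that filter follows. It uses a plain struct instead of TShaderPermutationDomain, and all names ending in `Sketch` are hypothetical stand-ins; only the remap rule itself is taken from the class above.

```cpp
#include <cassert>

// Plain stand-in for the four permutation dimensions declared in FAllocatePagesCS.
// This is not TShaderPermutationDomain; it only mirrors the dimension names.
struct FPermutationSketch
{
    bool bProcessDistanceFields;
    bool bMarkedHeightfieldPageBuffer;
    bool bComposeParentDistanceField;
    int OffsetDataStructure; // 0..2, matching SHADER_PERMUTATION_INT ( ... , 3 )

    bool operator==(const FPermutationSketch& Other) const
    {
        return bProcessDistanceFields == Other.bProcessDistanceFields
            && bMarkedHeightfieldPageBuffer == Other.bMarkedHeightfieldPageBuffer
            && bComposeParentDistanceField == Other.bComposeParentDistanceField
            && OffsetDataStructure == Other.OffsetDataStructure;
    }
};

// Same rule as RemapPermutation: composing the parent distance field
// forces the marked-heightfield dimension to false.
FPermutationSketch RemapSketch(FPermutationSketch P)
{
    assert(P.OffsetDataStructure >= 0 && P.OffsetDataStructure < 3);
    if (P.bComposeParentDistanceField)
    {
        P.bMarkedHeightfieldPageBuffer = false;
    }
    return P;
}

// Counts the permutations that survive the filter, i.e. those that remap to themselves.
int CountCompiledPermutationsSketch()
{
    int Count = 0;
    for (int Bits = 0; Bits < 8; ++Bits)
    {
        for (int Offset = 0; Offset < 3; ++Offset)
        {
            const FPermutationSketch P{ (Bits & 1) != 0, (Bits & 2) != 0, (Bits & 4) != 0, Offset };
            if (RemapSketch(P) == P)
            {
                ++Count;
            }
        }
    }
    return Count;
}
```

Of the 2 * 2 * 2 * 3 = 24 combinations, the 6 with both the compose-parent and marked-heightfield flags set are remapped to a different permutation and therefore skipped, so 18 remain. A caller is expected to apply the same remap before requesting a shader, so it never asks for a skipped permutation.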
2020-09-08 17:44:06 -04:00
class FPageFreeListReturnIndirectArgBufferCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FPageFreeListReturnIndirectArgBufferCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FPageFreeListReturnIndirectArgBufferCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWBuffer < uint > , RWFreeListReturnIndirectArgBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < int > , RWPageFreeListAllocatorBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , PageFreeListReturnAllocatorBuffer )
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-09-08 17:44:06 -04:00
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_X " ) , 1 ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Y " ) , 1 ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Z " ) , 1 ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FPageFreeListReturnIndirectArgBufferCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " PageFreeListReturnIndirectArgBufferCS " , SF_Compute ) ;
2020-09-08 17:44:06 -04:00
class FPageFreeListReturnCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FPageFreeListReturnCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FPageFreeListReturnCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
2021-03-17 06:01:59 -04:00
RDG_BUFFER_ACCESS ( FreeListReturnIndirectArgBuffer , ERHIAccess : : IndirectArgs )
2020-09-08 17:44:06 -04:00
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < int > , RWPageFreeListAllocatorBuffer )
SHADER_PARAMETER_RDG_BUFFER_UAV ( RWStructuredBuffer < uint > , RWPageFreeListBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , PageFreeListReturnAllocatorBuffer )
SHADER_PARAMETER_RDG_BUFFER_SRV ( StructuredBuffer < uint > , PageFreeListReturnBuffer )
END_SHADER_PARAMETER_STRUCT ( )
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-09-08 17:44:06 -04:00
}
static uint32 GetGroupSize ( )
{
return 64 ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_X " ) , GetGroupSize ( ) ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Y " ) , 1 ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Z " ) , 1 ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FPageFreeListReturnCS , " /Engine/Private/DistanceField/GlobalDistanceField.usf " , " PageFreeListReturnCS " , SF_Compute ) ;
2020-07-06 18:58:26 -04:00
2020-09-15 11:03:59 -04:00
class FPropagateMipDistanceCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER ( FPropagateMipDistanceCS ) ;
SHADER_USE_PARAMETER_STRUCT ( FPropagateMipDistanceCS , FGlobalShader ) ;
BEGIN_SHADER_PARAMETER_STRUCT ( FParameters , )
SHADER_PARAMETER_STRUCT_REF ( FViewUniformShaderParameters , View )
SHADER_PARAMETER_RDG_TEXTURE_UAV ( RWTexture3D < float > , RWMipTexture )
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < float > , PrevMipTexture )
2022-02-08 16:53:35 -05:00
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < uint > , PageTableTexture )
2020-09-15 11:03:59 -04:00
SHADER_PARAMETER_RDG_TEXTURE ( Texture3D < float > , PageAtlasTexture )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , GlobalDistanceFieldInvPageAtlasSize )
2020-09-15 11:03:59 -04:00
SHADER_PARAMETER ( uint32 , GlobalDistanceFieldClipmapSizeInPages )
SHADER_PARAMETER ( uint32 , ClipmapMipResolution )
2022-02-24 20:40:01 -05:00
SHADER_PARAMETER ( float , OneOverClipmapMipResolution )
2020-09-15 11:03:59 -04:00
SHADER_PARAMETER ( uint32 , ClipmapIndex )
SHADER_PARAMETER ( uint32 , PrevClipmapOffsetZ )
SHADER_PARAMETER ( uint32 , ClipmapOffsetZ )
2021-05-05 15:07:25 -04:00
SHADER_PARAMETER ( FVector3f , ClipmapUVScrollOffset )
2020-09-15 11:03:59 -04:00
SHADER_PARAMETER ( float , CoarseDistanceFieldValueScale )
SHADER_PARAMETER ( float , CoarseDistanceFieldValueBias )
END_SHADER_PARAMETER_STRUCT ( )
class FReadPages : SHADER_PERMUTATION_BOOL ( " READ_PAGES " ) ;
using FPermutationDomain = TShaderPermutationDomain < FReadPages > ;
static bool ShouldCompilePermutation ( const FGlobalShaderPermutationParameters & Parameters )
{
2021-07-12 10:24:46 -04:00
return ShouldCompileDistanceFieldShaders ( Parameters . Platform ) ;
2020-09-15 11:03:59 -04:00
}
static FIntVector GetGroupSize ( )
{
return FIntVector ( 4 , 4 , 4 ) ;
}
static void ModifyCompilationEnvironment ( const FGlobalShaderPermutationParameters & Parameters , FShaderCompilerEnvironment & OutEnvironment )
{
FGlobalShader : : ModifyCompilationEnvironment ( Parameters , OutEnvironment ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_X " ) , GetGroupSize ( ) . X ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Y " ) , GetGroupSize ( ) . Y ) ;
OutEnvironment . SetDefine ( TEXT ( " THREADGROUP_SIZE_Z " ) , GetGroupSize ( ) . Z ) ;
}
} ;
2022-04-22 19:55:41 -04:00
IMPLEMENT_GLOBAL_SHADER ( FPropagateMipDistanceCS , " /Engine/Private/DistanceField/GlobalDistanceFieldMip.usf " , " PropagateMipDistanceCS " , SF_Compute ) ;
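GetGroupSize ( ) in the classes above only fixes the number of threads per group; the caller still has to convert a thread count into a group count. The sketch below shows that rounding for a direct (non-indirect) dispatch. It assumes one thread per voxel of a cubic mip, and the struct and helper names ending in `Sketch` are hypothetical rather than engine API.

```cpp
#include <cassert>

struct FIntVectorSketch
{
    int X;
    int Y;
    int Z;
};

// Integer division that rounds up, so a partially filled group is still dispatched.
int DivideAndRoundUpSketch(int Dividend, int Divisor)
{
    assert(Dividend >= 0 && Divisor > 0);
    return (Dividend + Divisor - 1) / Divisor;
}

// Number of thread groups needed per axis to cover ThreadCount threads.
FIntVectorSketch GetGroupCountSketch(FIntVectorSketch ThreadCount, FIntVectorSketch GroupSize)
{
    return {
        DivideAndRoundUpSketch(ThreadCount.X, GroupSize.X),
        DivideAndRoundUpSketch(ThreadCount.Y, GroupSize.Y),
        DivideAndRoundUpSketch(ThreadCount.Z, GroupSize.Z)
    };
}
```

With the 4 x 4 x 4 group size used by FPropagateMipDistanceCS, a 64^3 mip would need 16 groups per axis. Because the count rounds up, a resolution that is not a multiple of 4 launches extra threads, and the shader has to bounds-check them.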
class FGlobalDistanceFieldDebugCS : public FGlobalShader
{
	DECLARE_GLOBAL_SHADER(FGlobalDistanceFieldDebugCS)
	SHADER_USE_PARAMETER_STRUCT(FGlobalDistanceFieldDebugCS, FGlobalShader)
	BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
		SHADER_PARAMETER_STRUCT_INCLUDE(ShaderPrint::FShaderParameters, ShaderPrintUniformBuffer)
		SHADER_PARAMETER_RDG_BUFFER_SRV(StructuredBuffer<uint>, GlobalDistanceFieldPageFreeListAllocatorBuffer)
		SHADER_PARAMETER(uint32, GlobalDistanceFieldMaxPageNum)
	END_SHADER_PARAMETER_STRUCT()
	static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters& Parameters)
	{
		return ShouldCompileDistanceFieldShaders(Parameters.Platform);
	}
	static int32 GetGroupSize()
	{
		return 4;
	}
	static void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment)
	{
		FGlobalShader::ModifyCompilationEnvironment(Parameters, OutEnvironment);
		OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZE"), GetGroupSize());
	}
};
IMPLEMENT_GLOBAL_SHADER(FGlobalDistanceFieldDebugCS, "/Engine/Private/DistanceField/GlobalDistanceFieldDebug.usf", "GlobalDistanceFieldDebugCS", SF_Compute);
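The doc comment on UpdateGlobalDistanceFieldVolume states that an update normally touches only the regions newly exposed by camera movement, and falls back to a full update on a camera cut. A one-axis sketch of that idea is shown here, assuming a clipmap of `Resolution` voxels whose origin scrolled by a whole number of voxels since the previous frame. `FSlab` and `ComputeExposedSlab` are hypothetical names invented for this illustration; the engine's actual region computation lives in ComputeUpdateRegionsAndUpdateViewState and is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical half-open voxel range [Begin, End) along one clipmap axis.
struct FSlab
{
	int32_t Begin;
	int32_t End;
};

// Returns the voxel range along one axis that the previous frame's clipmap did not cover,
// given that the clipmap origin moved by ScrollVoxels along that axis.
inline FSlab ComputeExposedSlab(int32_t Resolution, int32_t ScrollVoxels)
{
	assert(Resolution > 0);
	if (std::abs(ScrollVoxels) >= Resolution)
	{
		// Camera cut or very large movement: nothing overlaps, so the whole axis is new.
		return FSlab{0, Resolution};
	}
	if (ScrollVoxels > 0)
	{
		// Moved in the positive direction: new voxels appear at the far end.
		return FSlab{Resolution - ScrollVoxels, Resolution};
	}
	if (ScrollVoxels < 0)
	{
		// Moved in the negative direction: new voxels appear at the near end.
		return FSlab{0, -ScrollVoxels};
	}
	// No movement: empty range, nothing to update on this axis.
	return FSlab{0, 0};
}
```

Applying this per axis gives up to three thin slabs per clipmap, which is far less work than recompositing the whole volume when the camera moves a few voxels.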
2020-09-15 11:03:59 -04:00
2015-05-11 20:04:15 -04:00
/**
 * Updates the global distance field for a view.
 * Typically issues updates for just the newly exposed regions of the volume due to camera movement.
 * In the worst case of a camera cut or large distance field scene changes, a full update of the global distance field will be done.
 */
void UpdateGlobalDistanceFieldVolume(
2020-10-27 13:40:36 -04:00
	FRDGBuilder& GraphBuilder,
	FViewInfo& View,
	FScene* Scene,
	float MaxOcclusionDistance,
2021-02-04 15:30:42 -04:00
	bool bLumenEnabled,
2015-05-11 20:04:15 -04:00
	FGlobalDistanceFieldInfo& GlobalDistanceFieldInfo)
{
2021-07-13 12:38:37 -04:00
	RDG_RHI_GPU_STAT_SCOPE(GraphBuilder, GlobalDistanceFieldUpdate);
2018-09-11 14:44:10 -04:00
2020-07-06 18:58:26 -04:00
	const FDistanceFieldSceneData& DistanceFieldSceneData = Scene->DistanceFieldSceneData;
2021-02-04 15:30:42 -04:00
	UpdateGlobalDistanceFieldViewOrigin(View, bLumenEnabled);
2018-09-11 14:44:10 -04:00
2022-01-31 04:59:02 -05:00
	if (DistanceFieldSceneData.NumObjectsInBuffer > 0 || DistanceFieldSceneData.HeightfieldPrimitives.Num() > 0)
2015-05-11 20:04:15 -04:00
	{
2022-04-22 19:55:41 -04:00
		const int32 NumClipmaps = FMath::Clamp<int32>(GetNumGlobalDistanceFieldClipmaps(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance), 0, GlobalDistanceField::MaxClipmaps);
2021-02-04 15:30:42 -04:00
		ComputeUpdateRegionsAndUpdateViewState(GraphBuilder.RHICmdList, View, Scene, GlobalDistanceFieldInfo, NumClipmaps, MaxOcclusionDistance, bLumenEnabled);
2015-05-11 20:04:15 -04:00
2016-04-04 18:44:59 -04:00
		// Recreate the view uniform buffer now that we have updated GlobalDistanceFieldInfo
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3091903)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3072947 on 2016/08/01 by Uriel.Doyon
Texture GUIDs are now included in cooked builds, as they are required by the texture streamer to link build data to in game texture.
#review-3072934 @marcus.wassmer
#jira UE-34045
Change 3073301 on 2016/08/02 by Ben.Woodhouse
Fix for large spotlight culling precision issues, reported on UDN by Aaron Jacobs at Double Fine. For a full description, see the UDN post
https://udn.unrealengine.com/questions/305440/shadowed-light-flicker-caused-by-floating-point-pr.html
#jira UE-34052
Change 3073689 on 2016/08/02 by Ben.Woodhouse
Improved skin postprocess - support for full resolution, with diffuse/spec lighting combined into single RGBA (sharing chroma)
Full res lighting gives less temporal AA flickering, sharper diffuse and specular lighting in the surface (since this is now at full resolution), faster postprocessing if using a 64-bit rendertarget (on NV 980Ti).
Checkerboard rendering is controlled via the r.sss.checkerboard cvar. - 0 is off/full res, 1 is checkerboard, 2 is automatic based on scenecolor (non-checkerboard requires 64bit or more rendertarget w/separate alpha)
Tested/profiled on PC, PS4
Change 3074666 on 2016/08/02 by Daniel.Wright
Fixed stationary skylight brightness
Change 3074667 on 2016/08/02 by Daniel.Wright
Fixed r.ReflectionEnvironmentLightmapMixing
Change 3074687 on 2016/08/02 by Daniel.Wright
Disallowed DrawMaterialToRenderTarget and Begin/EndDrawCanvasToRenderTarget in construction scripts, since they don't work in game. Blutilities can be used to do blueprint rendering in the editor.
Change 3075241 on 2016/08/03 by Rolando.Caloca
DR - Fix linux compile issue & static analysis warning
Change 3075746 on 2016/08/03 by Daniel.Wright
Removed bOverride_AntiAliasingMethod and outdated ini references to PP AntiAliasingMethod
Change 3075783 on 2016/08/03 by Ryan.Brucks
#code.review Marcus.Wassmer
Added two material nodes that return Atmospheric Light Vector and Light Direction using:
View.AtmosphericFogSunColor
View.AtmosphericFogSunDirection
Nodes are called:
AtmosphericLightVector
AtmosphericLightColor
Also changed SceneRendering.cpp so that values will be grabbed from directional lights without needing an Atmospheric Fog actor in the scene.
Change 3075969 on 2016/08/03 by Uriel.Doyon
Material GUIDs are not updated anymore when parents or textures change.
Lighting now uses a hash built from the list of parents, textures and shader functions.
#review-3072980 @marcus.wassmer @daniel.wright
Change 3076116 on 2016/08/03 by Ryan.Brucks
#code.review marcus.wassmer
Fixed typo in the Caption of new Nodes "Atmospheric Light Vector" and "Atmospheric Light Color"
Change 3076456 on 2016/08/03 by Rolando.Caloca
DR - Fix geometry shader gl_Layer for SPIR-V
Change 3076730 on 2016/08/03 by Uriel.Doyon
Added user warning logic for the texture streaming build. Ran in MapCheck, BeginPlay and PreSave.
#review-3072984 @marcus.wassmer
Change 3077616 on 2016/08/04 by Daniel.Wright
Planar reflection show flags can now be edited
Change 3077621 on 2016/08/04 by Daniel.Wright
Changed default Planar Reflection DistanceFromPlaneFadeoutEnd from 600 to 100, which reduces artifacts and is a more intuitive initial setting
Change 3077792 on 2016/08/04 by Daniel.Wright
Fixed an unnecessary sky capture caused by the sky light component owned by the default ASkyLight
Change 3077799 on 2016/08/04 by Daniel.Wright
Skip RF_ArchetypeObject for reflection captures
Change 3077876 on 2016/08/04 by Marc.Olano
Noise material perf improvements
Change random number generator for Gradient-ALU (1.7x perf boost), improve speed of Voronoi noise quality level 3.
Removes integer BBS random number generators. Fewer instructions, but too slow to use (see 1.7x perf boost above)
Change 3077884 on 2016/08/04 by Daniel.Wright
Lighting channels can now be edited on components with static mobility, since dynamic lights can still affect them
Change 3078994 on 2016/08/05 by Simon.Tovey
Fix for UE-34241
Scene proxy ptr was being cached during a downcast.
Inside a call to CreateDynamicData, CheckMaterialUsage_Concurrent() was causing the scene proxy to be recreated and so the cached ptr was stale.
I've fixed the immediate issue but recreating the scene proxy here doesn't seem great. Maybe CheckMaterialUsage() should be rethought a bit.
Change 3079162 on 2016/08/05 by Ben.Woodhouse
Fix for jittering in Paper2D. Was caused by override being ignored due to a change in initialization order for AA settings.
#jira UE-34091
Change 3079613 on 2016/08/05 by Daniel.Wright
New blueprint function ClearRenderTarget2D, which is the only way to set a render target alpha directly
New blueprint function CreateRenderTarget2D
Change 3079708 on 2016/08/05 by Uriel.Doyon
Fixed crash when building texture streaming on some levels.
Change 3079795 on 2016/08/05 by Uriel.Doyon
Fixed issue with instanced static meshes when building texture streaming.
Fixed typo with func "GetNumTextureStreamingPrimitives"
Change 3079806 on 2016/08/05 by Uriel.Doyon
Enabled PerTexture MipBias. The per texture mip bias now resets to 0 when the texture gets required at low resolution.
New scalability setting named "r.Streaming.LimitPoolSizeToVRAM" enabling the PoolSize to be limited to the available VRAM
(according to GPoolSizeVRAMPercentage)
#review-3074662 @marcus.wassmer
Change 3082698 on 2016/08/09 by Daniel.Wright
Copy - CreateRenderTarget2D uses a world context object as owner, allows use in a construction script
Change 3082699 on 2016/08/09 by Daniel.Wright
Changed display name for 'Two Sided' shading model to 'Two Sided Foliage' to make it clear what it's intended to be used for
Change 3083909 on 2016/08/10 by Olaf.Piesche
#jira UE-34106
#jira UE-32784
#jira UE-31198
Reset vertex factories on mesh emitters if mesh has been reimported (if mesh package is dirty)
Change 3084645 on 2016/08/10 by Olaf.Piesche
#jira UE-30398
Fix offset added to particle collision locations.
Change 3084709 on 2016/08/10 by Daniel.Wright
Copy - Scene capture alpha is now inverted to match DrawMaterialToRenderTarget, and to allow compositing with existing render target contents
Added CompositeMode to SceneCapture2D, which can be used to additively accumulate or composite instead of the default overwrite behavior
Added bCaptureOnMovement to SceneCapture, which can be disabled so the only source of scene capturing is a manual capture by calling CaptureScene()
Change 3084783 on 2016/08/10 by Rolando.Caloca
DR - Use the first targeted rhi shader platform as the initial RHI to load on Windows
#jira UE-34510
Change 3084958 on 2016/08/10 by Daniel.Wright
Copy - Reverted cl 2938543 "Lightmass now respects owner bHidden, and bCastHiddenShadow" because it did not have backwards compatibility so breaks content using hidden light cards
Change 3086023 on 2016/08/11 by Marcus.Wassmer
Merging //UE4/Dev-Main@3085468 to Dev-Rendering (//UE4/Dev-Rendering)
#test none
Change 3086778 on 2016/08/11 by Ben.Woodhouse
Workaround for fortnite character rendering issue. Enable checkerboard rendering by default until we can fix properly
#jira UE-34561
Change 3087404 on 2016/08/12 by Rolando.Caloca
DR - Upgrade glslang to 1.0.21.1
- Added some more debug output
Change 3087524 on 2016/08/12 by Rolando.Caloca
DR - vk - Fixed StencilRef, fixed size of RHIReadSurfaceFloatData (but still returns dummy data)
Change 3087663 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix for SRGB; support for mip texture views
Change 3087735 on 2016/08/12 by Daniel.Wright
TextureRenderTarget2D's can now be up to 8192^2. Anything over 2048 pops up an 'are you sure' dialog.
Change 3087750 on 2016/08/12 by Rolando.Caloca
DR - vk - Minor renaming in prep for merge
Change 3087813 on 2016/08/12 by Rolando.Caloca
DR - vk - More minor cleanup
Change 3087819 on 2016/08/12 by Chris.Bunner
Check material function input types directly, no need to traverse connected graph.
#jira UE-32134
Change 3087901 on 2016/08/12 by Rolando.Caloca
DR - vk - Fix RT view to use 1 mip
Fix depth buffer component swizzle
Change 3088193 on 2016/08/12 by Daniel.Wright
DFAO and RTDF shadows are enabled in High and Epic scalability settings by default
Change 3088988 on 2016/08/15 by Rolando.Caloca
DR - Add Accessors
Change 3089104 on 2016/08/15 by Olaf.Piesche
#jira UE-34241
Sceneproxy can be nullptr in FDynamicMeshEmitterData::Init if the proxy is being recreated
Change 3089208 on 2016/08/15 by Daniel.Wright
Downsampled separate translucency uses a separate view uniform buffer with correct buffer sizes
* Fixes WorldPosition in downsampled translucency
* View uniform buffer parameters are now cached on the view, to allow recreating the uniform buffer without having to rebuild the entire struct. Currently used by global distance field, downsampled separate translucency.
* Fixed the downsampled translucency depth buffer being full res used together with a smaller color target, now they are both the downsampled res
Change 3089209 on 2016/08/15 by Daniel.Wright
Fixed atmospheric fog on translucency
Change 3089457 on 2016/08/15 by Daniel.Wright
Fixed lighting build failure from UMaterialInstanceDynamic assigned to a mesh that's being exported to Lightmass. The Swarm cache entry is created using the parent's guid, causing multiple MID's with the same parent to acquire a file handle multiple times which fails after the first.
Change 3089549 on 2016/08/15 by Daniel.Wright
UMaterialInterface initializes LightingGuid to something valid - causes UMaterialInstanceDynamic to have a valid LightingGuid so they can be used in lighting builds
Change 3089703 on 2016/08/15 by Daniel.Wright
Custom expression fixup for View.RenderTargetSize
Change 3090546 on 2016/08/16 by Daniel.Wright
Hopeful fix for recycled snapshot view crash
Change 3091202 on 2016/08/16 by Daniel.Wright
Manually clear FViewInfo::CachedViewUniformShaderParameters on creating a snapshot, since memcpy is used to create the snapshot view
[CL 3091931 by Gil Gribb in Main branch]
2016-08-17 11:38:13 -04:00
		View.SetupGlobalDistanceFieldUniformBufferParameters(*View.CachedViewUniformShaderParameters);
		View.ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*View.CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
2016-04-04 18:44:59 -04:00
2020-09-08 17:44:06 -04:00
		bool bHasUpdateBounds = false;
2015-05-11 20:04:15 -04:00
		for (int32 ClipmapIndex = 0; ClipmapIndex < GlobalDistanceFieldInfo.Clipmaps.Num(); ClipmapIndex++)
		{
2020-09-08 17:44:06 -04:00
			bHasUpdateBounds = bHasUpdateBounds || GlobalDistanceFieldInfo.Clipmaps[ClipmapIndex].UpdateBounds.Num() > 0;
2015-05-11 20:04:15 -04:00
		}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
		for (int32 ClipmapIndex = 0; ClipmapIndex < GlobalDistanceFieldInfo.MostlyStaticClipmaps.Num(); ClipmapIndex++)
		{
2020-09-08 17:44:06 -04:00
			bHasUpdateBounds = bHasUpdateBounds || GlobalDistanceFieldInfo.MostlyStaticClipmaps[ClipmapIndex].UpdateBounds.Num() > 0;
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning() is not compatible with -nothreading mode.
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was that HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
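The fence reuse bug noted under CL 3231395 can be illustrated with a minimal sketch. It assumes a simplified CPU-side model of a fence, not the D3D12 API: `FSimulatedFence` and `FFenceValueTracker` are hypothetical names, and the fence is modeled as a counter that only moves forward. Under that assumption, a wait on value 0 is satisfied immediately, whereas a strictly increasing signal value gives each wait a target that has not yet been reached.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GPU fence (not the D3D12 API).
// It remembers the highest value the "GPU" has reached so far.
struct FSimulatedFence
{
    uint64_t CompletedValue = 0;

    // Called when the GPU reaches a signal command carrying Value.
    void Complete(uint64_t Value)
    {
        CompletedValue = std::max(CompletedValue, Value);
    }

    // A wait on WaitValue is satisfied once the fence has reached it.
    bool IsComplete(uint64_t WaitValue) const
    {
        return CompletedValue >= WaitValue;
    }
};

// Hypothetical helper that hands out strictly increasing signal values,
// so a fence never has to be reset to an earlier value.
class FFenceValueTracker
{
public:
    uint64_t NextValueToSignal()
    {
        assert(LastIssuedValue < UINT64_MAX); // guard against wrap-around
        return ++LastIssuedValue;
    }

private:
    uint64_t LastIssuedValue = 0;
};
```

In this model, `IsComplete(0)` is true on a fresh fence, which mirrors the "waiting for 0, which was always signaled" symptom, while a value taken from `NextValueToSignal()` stays incomplete until `Complete()` is called with it.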
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize setting to the InvestigateTexture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
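The allocation scheme described for ESRAM stage 1 can be sketched as a page pool. This is a minimal illustration under stated assumptions, not the engine's allocator: `FPagePool` and its methods are hypothetical, physical pages are plain indices, and the "mapping" is a vector whose element i names the physical page backing virtual page i of a contiguous range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page pool: physical pages are handed out from a free list,
// so one allocation may be backed by non-adjacent physical pages.
class FPagePool
{
public:
    explicit FPagePool(uint32_t NumPages) : TotalPages(NumPages)
    {
        // Push in reverse so that page 0 is handed out first.
        for (uint32_t Page = NumPages; Page > 0; --Page)
        {
            FreePages.push_back(Page - 1);
        }
    }

    // Fills OutMapping so that OutMapping[i] is the physical page backing
    // virtual page i. Returns false if not enough pages are free.
    bool Allocate(uint32_t NumPages, std::vector<uint32_t>& OutMapping)
    {
        OutMapping.clear();
        if (NumPages > FreePages.size())
        {
            return false;
        }
        for (uint32_t Index = 0; Index < NumPages; ++Index)
        {
            OutMapping.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return true;
    }

    // Returns the pages of a destroyed resource to the pool for reuse.
    void Free(const std::vector<uint32_t>& Mapping)
    {
        assert(FreePages.size() + Mapping.size() <= TotalPages); // catches double frees
        FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    uint32_t TotalPages;
    std::vector<uint32_t> FreePages;
};
```

With a four-page pool, allocating one page, then two, then freeing the first allocation leaves pages 0 and 3 free. A following two-page request is served from those two non-adjacent pages, which is the case a contiguous-only allocator could not satisfy.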
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
}
2022-02-10 09:28:01 -05:00
if (bHasUpdateBounds)
2015-05-11 20:04:15 -04:00
{
2020-07-06 18:58:26 -04:00
    RDG_EVENT_SCOPE(GraphBuilder, "UpdateGlobalDistanceField");
2015-05-11 20:04:15 -04:00
2017-01-26 19:20:49 -05:00
    const FGlobalDFCacheType StartCacheType = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? GDF_MostlyStatic : GDF_Full;
2020-09-08 17:44:06 -04:00
    FRDGBufferRef PageFreeListAllocatorBuffer = nullptr;
    if (GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer)
    {
        PageFreeListAllocatorBuffer = GraphBuilder.RegisterExternalBuffer(GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer, TEXT("PageFreeListAllocator"));
    }
    FRDGBufferRef PageFreeListBuffer = nullptr;
    if (GlobalDistanceFieldInfo.PageFreeListBuffer)
    {
        PageFreeListBuffer = GraphBuilder.RegisterExternalBuffer(GlobalDistanceFieldInfo.PageFreeListBuffer, TEXT("PageFreeList"));
    }
    FRDGTextureRef PageAtlasTexture = nullptr;
    if (GlobalDistanceFieldInfo.PageAtlasTexture)
    {
        PageAtlasTexture = GraphBuilder.RegisterExternalTexture(GlobalDistanceFieldInfo.PageAtlasTexture, TEXT("PageAtlas"));
    }
2022-03-01 21:07:45 -05:00
    FRDGTextureRef CoverageAtlasTexture = nullptr;
    if (GlobalDistanceFieldInfo.CoverageAtlasTexture)
    {
        CoverageAtlasTexture = GraphBuilder.RegisterExternalTexture(GlobalDistanceFieldInfo.CoverageAtlasTexture, TEXT("CoverageAtlas"));
    }
2020-09-08 17:44:06 -04:00
    FRDGTextureRef PageTableCombinedTexture = nullptr;
    if (GlobalDistanceFieldInfo.PageTableCombinedTexture)
    {
        PageTableCombinedTexture = GraphBuilder.RegisterExternalTexture(GlobalDistanceFieldInfo.PageTableCombinedTexture, TEXT("PageTableCombined"));
    }
2020-09-15 11:03:59 -04:00
    FRDGTextureRef MipTexture = nullptr;
    if (GlobalDistanceFieldInfo.MipTexture)
    {
        MipTexture = GraphBuilder.RegisterExternalTexture(GlobalDistanceFieldInfo.MipTexture, TEXT("GlobalSDFMips"));
    }
    FRDGTextureRef TempMipTexture = nullptr;
    {
2021-02-04 15:30:42 -04:00
        const int32 ClipmapMipResolution = GlobalDistanceField::GetClipmapMipResolution(bLumenEnabled);
2020-09-24 00:43:27 -04:00
        FRDGTextureDesc TempMipDesc(FRDGTextureDesc::Create3D(
            FIntVector(ClipmapMipResolution),
2020-09-15 11:03:59 -04:00
            PF_R8,
            FClearValueBinding::Black,
2020-09-24 00:43:27 -04:00
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));
2020-09-15 11:03:59 -04:00
        TempMipTexture = GraphBuilder.CreateTexture(TempMipDesc, TEXT("TempMip"));
    }
2020-09-08 17:44:06 -04:00
    FRDGTextureRef PageTableLayerTextures[GDF_Num] = {};
2020-07-06 18:58:26 -04:00
    for (int32 CacheType = StartCacheType; CacheType < GDF_Num; CacheType++)
    {
2020-09-08 17:44:06 -04:00
        if (GlobalDistanceFieldInfo.PageTableLayerTextures[CacheType])
2020-07-06 18:58:26 -04:00
        {
2020-09-08 17:44:06 -04:00
            PageTableLayerTextures[CacheType] = GraphBuilder.RegisterExternalTexture(GlobalDistanceFieldInfo.PageTableLayerTextures[CacheType], TEXT("GlobalDistanceFieldPageTableLayer"));
2020-07-06 18:58:26 -04:00
        }
    }
2020-09-08 17:44:06 -04:00
    if (View.ViewState && View.ViewState->bGlobalDistanceFieldPendingReset)
    {
        // Reset all allocators to default
        const uint32 PageTableClearValue[4] = { 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF };
        if (PageTableCombinedTexture)
        {
            AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(PageTableCombinedTexture), PageTableClearValue);
        }
        for (int32 CacheType = StartCacheType; CacheType < GDF_Num; ++CacheType)
        {
            if (PageTableLayerTextures[CacheType])
            {
                AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(PageTableLayerTextures[CacheType]), PageTableClearValue);
            }
        }
2022-01-26 17:07:27 -05:00
        const int32 MaxPageNum = GlobalDistanceField::GetMaxPageNum(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
2020-09-08 17:44:06 -04:00
        if (PageFreeListAllocatorBuffer)
        {
            FInitPageFreeListCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FInitPageFreeListCS::FParameters>();
            PassParameters->RWPageFreeListBuffer = GraphBuilder.CreateUAV(PageFreeListBuffer, PF_R32_UINT);
            PassParameters->RWPageFreeListAllocatorBuffer = GraphBuilder.CreateUAV(PageFreeListAllocatorBuffer, PF_R32_SINT);
            PassParameters->GlobalDistanceFieldMaxPageNum = MaxPageNum;
            auto ComputeShader = View.ShaderMap->GetShader<FInitPageFreeListCS>();
            const FIntVector GroupSize = FComputeShaderUtils::GetGroupCount(MaxPageNum, FInitPageFreeListCS::GetGroupSize());
            FComputeShaderUtils::AddPass(
                GraphBuilder,
                RDG_EVENT_NAME("InitPageFreeList"),
                ComputeShader,
                PassParameters,
                GroupSize);
        }
        View.ViewState->bGlobalDistanceFieldPendingReset = false;
    }
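The dispatch size for the InitPageFreeList pass comes from dividing the page count by the shader's group size. A minimal standalone sketch of that arithmetic follows, assuming a one-dimensional dispatch that rounds up so every page gets a thread. `ComputeGroupCount1D` is a hypothetical helper, not the engine function, and the group size of 64 used in the usage note is an assumed value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of thread groups needed so that
// GroupCount * GroupSize >= ItemCount (division rounded up).
inline uint32_t ComputeGroupCount1D(uint32_t ItemCount, uint32_t GroupSize)
{
    assert(GroupSize > 0);
    return (ItemCount + GroupSize - 1) / GroupSize;
}

// Threads launched beyond ItemCount; a shader dispatched this way
// would normally bounds-check its thread index against ItemCount.
inline uint32_t ComputeIdleThreads(uint32_t ItemCount, uint32_t GroupSize)
{
    return ComputeGroupCount1D(ItemCount, GroupSize) * GroupSize - ItemCount;
}
```

For example, 1000 pages with an assumed group size of 64 need 16 groups, which launches 1024 threads and leaves 24 of them with no page to initialize.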
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
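The inheritance described in change 3260594 can be pictured with a toy 1D model. This is only a sketch under stated assumptions, not the engine's implementation: `FToySphere`, `SphereDistance` and `CompositeFull` are hypothetical names, and the real clipmaps are 3D volume textures updated on the GPU. The idea shown is that the full field starts from the cached mostly static distances and only takes the minimum against movable objects inside the region being updated.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D "sphere": an interval centred at Center with half-width Radius.
struct FToySphere
{
	float Center;
	float Radius;
};

// Signed distance from position X to the surface of the toy sphere.
inline float SphereDistance(const FToySphere& Sphere, float X)
{
	return std::fabs(X - Sphere.Center) - Sphere.Radius;
}

// Rebuild cells [Begin, End) of the full field. Static geometry is not
// re-evaluated: its distances are inherited from StaticCache, and only the
// movable objects are composited on top with a min().
inline void CompositeFull(
	const std::vector<float>& StaticCache,
	const std::vector<FToySphere>& Movable,
	std::vector<float>& Full,
	std::size_t Begin,
	std::size_t End)
{
	assert(StaticCache.size() == Full.size());
	assert(Begin <= End && End <= Full.size());

	for (std::size_t Cell = Begin; Cell < End; ++Cell)
	{
		float Distance = StaticCache[Cell]; // inherit from the mostly static cache
		for (const FToySphere& Sphere : Movable)
		{
			Distance = std::min(Distance, SphereDistance(Sphere, static_cast<float>(Cell)));
		}
		Full[Cell] = Distance;
	}
}
```

When a movable object changes, a caller would pass only the cell range it touches, which is the kind of saving the changelist's timings suggest.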
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning() is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was that HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
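The fence reuse bug in CL 3231395 can be illustrated with a toy model of a monotonic fence. This is a sketch, not the D3D12 RHI code: `FToyFence` and its members are hypothetical. It assumes the usual fence contract that a wait on value V is satisfied once the completed value is at least V, which is why waiting for 0 always succeeds and why handing out strictly increasing values removes any need to reset the fence.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a GPU fence. The completed value only ever rises.
struct FToyFence
{
	uint64_t CompletedValue = 0; // last value the simulated GPU has reached
	uint64_t NextValue = 1;      // next value the CPU will ask the GPU to signal

	// Queue a signal and return the value a later wait should use.
	uint64_t Signal()
	{
		return NextValue++;
	}

	// Simulated GPU progress. Only values that were handed out can be reached,
	// and completion never moves backwards.
	void GpuReaches(uint64_t Value)
	{
		assert(Value < NextValue);
		if (Value > CompletedValue)
		{
			CompletedValue = Value;
		}
	}

	// A wait on Value is satisfied once the completed value has caught up.
	bool IsComplete(uint64_t Value) const
	{
		return CompletedValue >= Value;
	}
};
```

In this model a wait on 0 returns immediately even on a fresh fence, so it gives no synchronisation at all, while each value returned by `Signal()` stays pending until the GPU actually reaches it.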
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
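A minimal sketch of the noncontiguous page allocation idea from change 3271594, assuming a simple free list of physical page indices. `FToyPageAllocator` is a hypothetical class, not the engine's allocator, and no real ESRAM or virtual memory API is used: the returned vector stands in for the mapping, where virtual page i of the allocation is backed by physical page `Mapping[i]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical allocator that hands out any free physical pages and reports
// them as a virtual-to-physical table. The table index is contiguous even
// when the physical pages are not.
class FToyPageAllocator
{
public:
	explicit FToyPageAllocator(uint32_t NumPages)
	{
		assert(NumPages > 0);
		// Push in reverse so that page 0 is handed out first.
		for (uint32_t Page = NumPages; Page > 0; --Page)
		{
			FreePages.push_back(Page - 1);
		}
	}

	// Returns the physical pages backing virtual pages 0..Count-1,
	// or an empty vector if the request cannot be satisfied.
	std::vector<uint32_t> Allocate(uint32_t Count)
	{
		std::vector<uint32_t> Mapping;
		if (Count == 0 || Count > FreePages.size())
		{
			return Mapping;
		}
		for (uint32_t Index = 0; Index < Count; ++Index)
		{
			Mapping.push_back(FreePages.back());
			FreePages.pop_back();
		}
		return Mapping;
	}

	// Return the pages of a destroyed resource to the free list for reuse.
	void Free(const std::vector<uint32_t>& Mapping)
	{
		for (uint32_t Page : Mapping)
		{
			FreePages.push_back(Page);
		}
	}

	std::size_t NumFree() const
	{
		return FreePages.size();
	}

private:
	std::vector<uint32_t> FreePages;
};
```

After a free, a later allocation can be served from scattered pages, which a contiguous-only scheme could not do without the free pages happening to be adjacent.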
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
for (int32 CacheType = StartCacheType; CacheType < GDF_Num; CacheType++)
2015-05-11 20:04:15 -04:00
{
2020-09-08 17:44:06 -04:00
	FRDGTextureRef PageTableLayerTexture = PageTableLayerTextures[CacheType];
	FRDGTextureRef ParentPageTableLayerTexture = nullptr;
	if (CacheType == GDF_Full && GAOGlobalDistanceFieldCacheMostlyStaticSeparately && PageTableLayerTextures[GDF_MostlyStatic])
	{
		ParentPageTableLayerTexture = PageTableLayerTextures[GDF_MostlyStatic];
	}
2020-07-06 18:58:26 -04:00
	TArray<FGlobalDistanceFieldClipmap>& Clipmaps = CacheType == GDF_MostlyStatic
		? GlobalDistanceFieldInfo.MostlyStaticClipmaps
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
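One common source of waste in buddy schemes is rounding each request up to a power-of-two block. The sketch below shows that arithmetic only. It is an assumption that this is relevant to the waste measured in change 3251141, since the changelist does not say where the waste came from. `BuddyBlockSize` and `BuddyWaste` are hypothetical helpers, and `MinBlockSize` is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Size of the block a simple buddy allocator would hand out: the smallest
// power-of-two multiple of MinBlockSize that fits the request.
inline uint64_t BuddyBlockSize(uint64_t RequestedBytes, uint64_t MinBlockSize)
{
	assert(RequestedBytes > 0 && MinBlockSize > 0);
	uint64_t Block = MinBlockSize;
	while (Block < RequestedBytes)
	{
		Block <<= 1;
	}
	return Block;
}

// Bytes lost to rounding for a single allocation.
inline uint64_t BuddyWaste(uint64_t RequestedBytes, uint64_t MinBlockSize)
{
	return BuddyBlockSize(RequestedBytes, MinBlockSize) - RequestedBytes;
}
```

With a 256-byte minimum block, a 5000-byte request lands in an 8192-byte block and loses 3192 bytes, while a request of exactly 4096 bytes loses nothing.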
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings for use. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
2017-01-26 19:20:49 -05:00
		: GlobalDistanceFieldInfo.Clipmaps;
2015-05-11 20:04:15 -04:00
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single-threaded mode in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
for (int32 ClipmapIndex = 0; ClipmapIndex < Clipmaps.Num(); ClipmapIndex++)
2015-05-11 20:04:15 -04:00
{
2020-07-06 18:58:26 -04:00
RDG_EVENT_SCOPE(GraphBuilder, "Clipmap:%d CacheType:%s", ClipmapIndex, CacheType == GDF_MostlyStatic ? TEXT("MostlyStatic") : TEXT("Movable"));
2015-05-11 20:04:15 -04:00
2017-01-26 19:20:49 -05:00
FGlobalDistanceFieldClipmap& Clipmap = Clipmaps[ClipmapIndex];
2021-02-04 15:30:42 -04:00
const int32 ClipmapResolution = GlobalDistanceField::GetClipmapResolution(bLumenEnabled);
2020-09-08 17:44:06 -04:00
const FVector ClipmapWorldCenter = Clipmap.Bounds.GetCenter();
const FVector ClipmapWorldExtent = Clipmap.Bounds.GetExtent();
const FVector ClipmapSize = Clipmap.Bounds.GetSize();
const FVector ClipmapVoxelSize = ClipmapSize / FVector(ClipmapResolution);
const FVector ClipmapVoxelExtent = 0.5f * ClipmapVoxelSize;
const float ClipmapVoxelRadius = ClipmapVoxelExtent.Size();
const float ClipmapInfluenceRadius = (GGlobalDistanceFieldInfluenceRangeInVoxels * ClipmapSize.X) / ClipmapResolution;
2020-07-06 18:58:26 -04:00
2021-09-22 10:01:48 -04:00
FVector4f ClipmapVolumeWorldToUVAddAndMul;
2020-09-08 17:44:06 -04:00
const FVector WorldToUVAdd = (Clipmap.ScrollOffset - Clipmap.Bounds.GetCenter()) / (Clipmap.Bounds.GetExtent().X * 2.0f) + FVector(0.5f);
2022-02-02 07:59:31 -05:00
ClipmapVolumeWorldToUVAddAndMul = FVector4f((FVector3f)WorldToUVAdd, 1.0f / (Clipmap.Bounds.GetExtent().X * 2.0f));
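The clipmap setup above packs a world-to-UV transform into a single four-component value: the first three components are an additive offset and the fourth is a uniform scale. The following is a minimal standalone sketch of that arithmetic, not the engine code. `Vec3`, `Vec4`, `Box`, `InfluenceRadius`, `MakeWorldToUVAddAndMul` and `WorldToUV` are hypothetical stand-ins introduced for illustration, and the sketch assumes the consumer evaluates `UV = WorldPosition * Mul + Add`. Any wrap-around addressing applied afterwards is left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the engine's vector and box types (illustration only).
struct Vec3 { double X, Y, Z; };
struct Vec4 { double X, Y, Z, W; };

struct Box
{
    Vec3 Min, Max;
    Vec3 GetCenter() const { return {0.5 * (Min.X + Max.X), 0.5 * (Min.Y + Max.Y), 0.5 * (Min.Z + Max.Z)}; }
    Vec3 GetExtent() const { return {0.5 * (Max.X - Min.X), 0.5 * (Max.Y - Min.Y), 0.5 * (Max.Z - Min.Z)}; }
    Vec3 GetSize() const { return {Max.X - Min.X, Max.Y - Min.Y, Max.Z - Min.Z}; }
};

// Influence radius in world units: a range given in voxels times the voxel size along X.
double InfluenceRadius(const Box& Bounds, int Resolution, double RangeInVoxels)
{
    assert(Resolution > 0);
    return (RangeInVoxels * Bounds.GetSize().X) / Resolution;
}

// Packs the additive offset (XYZ) and the uniform scale (W).
// Like the source lines, only the X extent is used, so cubic bounds are assumed.
Vec4 MakeWorldToUVAddAndMul(const Box& Bounds, const Vec3& ScrollOffset)
{
    const double Size = Bounds.GetExtent().X * 2.0;
    assert(Size > 0.0);
    const Vec3 C = Bounds.GetCenter();
    return {(ScrollOffset.X - C.X) / Size + 0.5,
            (ScrollOffset.Y - C.Y) / Size + 0.5,
            (ScrollOffset.Z - C.Z) / Size + 0.5,
            1.0 / Size};
}

// Assumed consumer side: UV = P * Mul + Add (wrapping, if any, is omitted here).
Vec3 WorldToUV(const Vec3& P, const Vec4& AddAndMul)
{
    return {P.X * AddAndMul.W + AddAndMul.X,
            P.Y * AddAndMul.W + AddAndMul.Y,
            P.Z * AddAndMul.W + AddAndMul.Z};
}
```

With a zero scroll offset, the center of the bounds maps to UV 0.5 and the min and max corners map to 0 and 1. A non-zero scroll offset shifts every UV by the same amount, which is what lets a scrolling volume keep previously written texels in place.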
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2020-07-06 18:58:26 -04:00
int32 MaxSDFMeshObjects = FMath::RoundUpToPowerOfTwo(DistanceFieldSceneData.NumObjectsInBuffer);
2020-09-08 17:44:06 -04:00
FRDGBufferRef ObjectIndexBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateStructuredDesc(sizeof(uint32), MaxSDFMeshObjects), TEXT("ObjectIndices"));
FRDGBufferRef ObjectIndexNumBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateStructuredDesc(sizeof(uint32), 1), TEXT("ObjectIndexNum"));
2020-07-06 18:58:26 -04:00
// Upload update bounds data
FRDGBufferRef UpdateBoundsBuffer = nullptr;
uint32 NumUpdateBounds = 0;
{
const uint32 BufferStrideInFloat4 = 2;
2021-09-22 10:01:48 -04:00
const uint32 BufferStride = BufferStrideInFloat4 * sizeof(FVector4f);
2021-06-09 17:18:09 -04:00
2021-09-22 10:01:48 -04:00
FRDGUploadData<FVector4f> UpdateBoundsData(GraphBuilder, BufferStrideInFloat4 * Clipmap.UpdateBounds.Num());
2020-07-06 18:58:26 -04:00
for (int32 UpdateBoundsIndex = 0; UpdateBoundsIndex < Clipmap.UpdateBounds.Num(); ++UpdateBoundsIndex)
2015-05-11 20:04:15 -04:00
{
2020-07-06 18:58:26 -04:00
const FClipmapUpdateBounds& UpdateBounds = Clipmap.UpdateBounds[UpdateBoundsIndex];
2022-02-02 07:59:31 -05:00
UpdateBoundsData[NumUpdateBounds * BufferStrideInFloat4 + 0] = FVector4f((FVector3f)UpdateBounds.Center, UpdateBounds.bExpandByInfluenceRadius ? 1.0f : 0.0f);
UpdateBoundsData[NumUpdateBounds * BufferStrideInFloat4 + 1] = FVector4f((FVector3f)UpdateBounds.Extent, 0.0f);
2020-07-06 18:58:26 -04:00
++NumUpdateBounds;
}
check(UpdateBoundsData.Num() % BufferStrideInFloat4 == 0);
2021-06-09 17:18:09 -04:00
UpdateBoundsBuffer =
CreateUploadBuffer(GraphBuilder, TEXT("UpdateBoundsBuffer"),
2021-09-22 10:01:48 -04:00
sizeof(FVector4f), FMath::RoundUpToPowerOfTwo(FMath::Max(UpdateBoundsData.Num(), 2)),
2021-06-09 17:18:09 -04:00
UpdateBoundsData);
2020-07-06 18:58:26 -04:00
}
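The packing above stores two float4 values per update region (center plus an expand flag, then extent) and sizes the upload buffer to a power of two with a floor of two elements. A minimal standalone sketch of that layout follows. `Float4`, `UpdateBoundsEntry`, `PackUpdateBounds` and `UploadElementCount` are hypothetical stand-ins written for illustration and are not engine types or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for FVector4f and FClipmapUpdateBounds.
struct Float4 { float X, Y, Z, W; };

struct UpdateBoundsEntry
{
    float Center[3];
    float Extent[3];
    bool bExpandByInfluenceRadius;
};

// Two float4 values are written per bounds entry.
constexpr std::uint32_t StrideInFloat4 = 2;

// Smallest power of two that is >= Value.
inline std::uint32_t RoundUpToPowerOfTwo(std::uint32_t Value)
{
    assert(Value >= 1 && Value <= (1u << 31));
    std::uint32_t Result = 1;
    while (Result < Value) { Result <<= 1; }
    return Result;
}

// Packs each entry as [Center.xyz, ExpandFlag] followed by [Extent.xyz, 0].
inline std::vector<Float4> PackUpdateBounds(const std::vector<UpdateBoundsEntry>& Bounds)
{
    std::vector<Float4> Packed;
    Packed.reserve(Bounds.size() * StrideInFloat4);
    for (const UpdateBoundsEntry& B : Bounds)
    {
        Packed.push_back({B.Center[0], B.Center[1], B.Center[2], B.bExpandByInfluenceRadius ? 1.0f : 0.0f});
        Packed.push_back({B.Extent[0], B.Extent[1], B.Extent[2], 0.0f});
    }
    return Packed;
}

// Element count for the GPU buffer: at least 2, rounded up to a power of two.
inline std::uint32_t UploadElementCount(std::size_t NumPacked)
{
    return RoundUpToPowerOfTwo(std::max<std::uint32_t>(static_cast<std::uint32_t>(NumPacked), 2u));
}
```

The floor of two elements presumably keeps the buffer non-empty when there are no update regions; that reading is an inference from the code, not something the source states.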
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
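As a rough illustration of where buddy-allocator waste comes from, the sketch below assumes a textbook buddy scheme in which every request is rounded up to a power-of-two block. The names `BuddyBlockSize`, `BuddyWaste` and `UseBuddyAllocator` are hypothetical, and the actual D3D12RHI suballocator may behave differently.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power-of-two block that can hold Size bytes, never below MinBlock.
inline std::uint64_t BuddyBlockSize(std::uint64_t Size, std::uint64_t MinBlock)
{
    assert(Size > 0 && MinBlock > 0 && (MinBlock & (MinBlock - 1)) == 0);
    std::uint64_t Block = MinBlock;
    while (Block < Size) { Block <<= 1; }
    return Block;
}

// Bytes lost to rounding for a single allocation.
inline std::uint64_t BuddyWaste(std::uint64_t Size, std::uint64_t MinBlock)
{
    return BuddyBlockSize(Size, MinBlock) - Size;
}

// Hypothetical routing rule: only requests at or below the threshold use the
// buddy suballocator; larger requests would go to some other allocator.
inline bool UseBuddyAllocator(std::uint64_t Size, std::uint64_t MaxBuddySize)
{
    return Size <= MaxBuddySize;
}
```

Under these assumptions a 5000-byte request occupies an 8192-byte block, so lowering the threshold from 512KB to 4KB routes such a request away from the buddy allocator.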
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in incoherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
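The fence reuse bug described in the list above can be illustrated with a small CPU-side model, assuming a fence whose completed value starts at 0 and only ever increases. `MonotonicFence` is a hypothetical class written for this sketch and is not the D3D12RHI implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU-side model of a GPU fence; not the engine's fence class.
class MonotonicFence
{
public:
    // Reserves the next fence value for newly submitted work and returns it.
    std::uint64_t Signal()
    {
        ++LastSignaled;
        return LastSignaled;
    }

    // Marks everything up to and including Value as finished.
    void Complete(std::uint64_t Value)
    {
        assert(Value <= LastSignaled);
        if (Value > Completed) { Completed = Value; }
    }

    // A wait on Value is satisfied once the completed value has reached it.
    bool IsComplete(std::uint64_t Value) const { return Value <= Completed; }

private:
    std::uint64_t LastSignaled = 0;
    std::uint64_t Completed = 0;
};
```

In this model a wait on 0 is satisfied immediately, which matches the symptom in the changelist text, and because values only grow there is never a reason to reset the fence to an earlier value.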
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return an invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
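A minimal sketch of the idea behind a non-contiguous page allocator follows, assuming a simple free list of physical page indices. `PageAllocator` and its methods are hypothetical names; the real allocator also maps pages into a virtual address range through platform APIs that are not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page allocator: hands out physical pages in any order. The
// position of a page inside the returned vector plays the role of its slot in
// a contiguous virtual range.
class PageAllocator
{
public:
    explicit PageAllocator(std::uint32_t NumPages)
    {
        for (std::uint32_t Page = NumPages; Page > 0; --Page)
        {
            FreePages.push_back(Page - 1);
        }
    }

    // Fills OutMapping with Count physical pages; fails if too few are free.
    bool Allocate(std::uint32_t Count, std::vector<std::uint32_t>& OutMapping)
    {
        if (Count > FreePages.size()) { return false; }
        OutMapping.clear();
        OutMapping.reserve(Count);
        for (std::uint32_t Index = 0; Index < Count; ++Index)
        {
            OutMapping.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return true;
    }

    // Returns the pages to the free list so later allocations can reuse them.
    void Free(const std::vector<std::uint32_t>& Mapping)
    {
        FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<std::uint32_t> FreePages;
};
```

Because any free page can back any virtual slot, a request succeeds whenever enough pages are free in total, even if the free pages are scattered.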
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2020-09-08 17:44:06 -04:00
FHeightfieldDescription UpdateRegionHeightfield;
// Update heightfield descriptors
{
const int32 NumHeightfieldPrimitives = DistanceFieldSceneData.HeightfieldPrimitives.Num();
if ((CacheType == GDF_MostlyStatic || !GAOGlobalDistanceFieldCacheMostlyStaticSeparately)
&& NumUpdateBounds > 0
&& NumHeightfieldPrimitives > 0
&& GAOGlobalDistanceFieldRepresentHeightfields
&& SupportsDistanceFieldAO(Scene->GetFeatureLevel(), Scene->GetShaderPlatform())
&& !IsVulkanMobileSM5Platform(Scene->GetShaderPlatform()))
{
for (int32 HeightfieldPrimitiveIndex = 0; HeightfieldPrimitiveIndex < NumHeightfieldPrimitives; HeightfieldPrimitiveIndex++)
{
2022-02-24 18:42:28 -05:00
const FPrimitiveSceneProxy* HeightfieldPrimitiveProxy = Scene->DistanceFieldSceneData.HeightfieldPrimitives[HeightfieldPrimitiveIndex]->Proxy;
const FBoxSphereBounds& PrimitiveBounds = HeightfieldPrimitiveProxy->GetBounds();
if (HeightfieldPrimitiveProxy->HeightfieldHasPendingStreaming())
{
continue;
}
2020-09-08 17:44:06 -04:00
// Expand bounding box by a SDF max influence distance (only in local Z axis, as distance is computed from a top down projected heightmap point).
2022-02-24 18:42:28 -05:00
const FVector QueryInfluenceExpand = HeightfieldPrimitiveProxy->GetLocalToWorld().GetUnitAxis(EAxis::Z) * FVector(0.0f, 0.0f, ClipmapInfluenceRadius);
2020-09-08 17:44:06 -04:00
const FBox HeightfieldInfluenceBox = PrimitiveBounds.GetBox().ExpandBy(QueryInfluenceExpand, QueryInfluenceExpand);
if (Clipmap.Bounds.Intersect(HeightfieldInfluenceBox))
{
UTexture2D* HeightfieldTexture = nullptr;
UTexture2D* DiffuseColorTexture = nullptr;
UTexture2D* VisibilityTexture = nullptr;
2022-02-24 18:42:28 -05:00
FHeightfieldComponentDescription NewComponentDescription(HeightfieldPrimitiveProxy->GetLocalToWorld());
HeightfieldPrimitiveProxy->GetHeightfieldRepresentation(HeightfieldTexture, DiffuseColorTexture, VisibilityTexture, NewComponentDescription);
2020-09-08 17:44:06 -04:00
2021-05-14 07:17:32 -04:00
if (HeightfieldTexture && HeightfieldTexture->GetResource() && HeightfieldTexture->GetResource()->TextureRHI)
2020-09-08 17:44:06 -04:00
{
TArray<FHeightfieldComponentDescription>& ComponentDescriptions = UpdateRegionHeightfield.ComponentDescriptions.FindOrAdd(FHeightfieldComponentTextures(HeightfieldTexture, DiffuseColorTexture, VisibilityTexture));
ComponentDescriptions.Add(NewComponentDescription);
}
}
}
}
}
if (NumUpdateBounds > 0 && PageAtlasTexture)
2020-07-06 18:58:26 -04:00
{
// Cull the global objects to the update regions
2022-01-31 04:59:02 -05:00
if (Scene->DistanceFieldSceneData.NumObjectsInBuffer > 0)
2020-07-06 18:58:26 -04:00
{
uint32 AcceptOftenMovingObjectsOnlyValue = 0;
2020-06-23 18:40:00 -04:00
2020-07-06 18:58:26 -04:00
if (!GAOGlobalDistanceFieldCacheMostlyStaticSeparately)
{
AcceptOftenMovingObjectsOnlyValue = 2;
}
else if (CacheType == GDF_Full)
{
// First cache is for mostly static, second contains both, inheriting static objects distance fields with a lookup
// So only composite often moving objects into the full global distance field
AcceptOftenMovingObjectsOnlyValue = 1;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite of further PSO work where we want to move up State setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky about double underscores and wants the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
}
2020-07-06 18:58:26 -04:00
AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(ObjectIndexNumBuffer, PF_R32_UINT), 0);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3461187)
#lockdown Nick.Penwarden
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3388286 on 2017/04/11 by Chris.Bunner
Fix mips in texture GnmUAV creation.
Change 3388287 on 2017/04/11 by Chris.Bunner
Improved PS/CS code sharing for TemporalAA.
Change 3388291 on 2017/04/11 by Chris.Bunner
HLODs now correctly hide their children in shadow maps.
Propagate bCastFarShadow flag on HLOD generation.
#jira UE-42254
Change 3388448 on 2017/04/11 by Brian.Karis
Better handle divide by zero
Change 3388449 on 2017/04/11 by Brian.Karis
Optimizations to shading model math.
PR #3340: Bug fixes related to shader TODOs (Contributed by vgfx)
Change 3388455 on 2017/04/11 by Uriel.Doyon
Changed Remove for RemoveSwap when clearing dynamic component references
Change 3388612 on 2017/04/11 by Simon.Tourangeau
Support shaders in projects and in plugins
When searching for a shader it will
- First look in Engine/Shaders as usual
- Then in project's Shader folder
- Then in all enabled plugin Shader folders
Project or plugin must be loaded in PostConfigInit phase
Tested in PIE, engine (cooked, packaged)
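The search order described in this change can be sketched as a simple first-match lookup. `FindShaderDirectory` and the injected `FileExists` callback are hypothetical names chosen for illustration; the engine's actual shader lookup is not shown in the source.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Returns the first directory, in priority order, that contains FileName, or
// an empty string if none does. FileExists is injected so the sketch does not
// depend on a real file system.
inline std::string FindShaderDirectory(
    const std::string& FileName,
    const std::string& EngineShaderDir,
    const std::string& ProjectShaderDir,
    const std::vector<std::string>& PluginShaderDirs,
    const std::function<bool(const std::string&)>& FileExists)
{
    assert(!FileName.empty());

    // Priority order: engine first, then the project, then enabled plugins.
    std::vector<std::string> SearchOrder;
    SearchOrder.push_back(EngineShaderDir);
    SearchOrder.push_back(ProjectShaderDir);
    SearchOrder.insert(SearchOrder.end(), PluginShaderDirs.begin(), PluginShaderDirs.end());

    for (const std::string& Dir : SearchOrder)
    {
        if (FileExists(Dir + "/" + FileName))
        {
            return Dir;
        }
    }
    return std::string();
}
```

With this ordering an engine shader always wins over a project or plugin shader of the same name, and a project shader wins over a plugin one.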
Change 3388819 on 2017/04/11 by Arne.Schober
DR - Faster MorphTarget implementation. Changed the previous gather approach to a scatter-based one. Reaching about 110GB/s on PS4, which is up to 4x faster than the previous implementation. On PC DX11 the impact is lower due to unnecessary UAV barriers but still 2x faster on AMD and up to 6x faster on Nvidia hardware.
#RB Lina.Halper, Rolando.Caloca
Change 3388862 on 2017/04/11 by Guillaume.Abadie
Allows Motion Blur and TAA in scene capture 2d.
Change 3388953 on 2017/04/11 by Uriel.Doyon
Fixed issue where lights from hidden levels were affecting the lighting build, by checking if the light is registered before adding it.
#UE-43220
Change 3389138 on 2017/04/11 by Arne.Schober
DR - Fix crash when opening a Level without Contentbrowser open.
#RB Matt.Kuhlenschmidt
Change 3389400 on 2017/04/11 by Uriel.Doyon
- Renamed FMaterialResource::IsSeparateTranslucencyEnabled() to FMaterialResource::IsTranslucencyAfterDOFEnabled()
- Removed different logic to determine if translucency after DOF was enabled, and centralized it into a single function: FSceneViewFamily::AllowTranslucencyAfterDOF()
- FSceneRenderTargets::FinishRenderingSeparateTranslucency() now only resolves a single view, allowing better Begin/Finish scopes.
- Renamed FSceneRenderTargets::SeparateTranslucencyDepthRT into FSceneRenderTargets::DownsampledTranslucencyDepthRT since this one is only allocated when rendering in downsampled mode.
- Standard translucency is now rendered at the same resolution as translucency after DOF. (downsampled or full resolution)
- Removed RenderTranslucencyParallel and merged its logic into RenderTranslucency. Renamed DrawAllTranslucencyPasses to RenderViewTranslucency and added a parallel version RenderViewTranslucencyParallel.
- Moved all debug draw logic (VisualizeLPV, ViewMeshElements and SimpleElementCollector) to a common place.
- New option "r.AllowDownsampledStandardTranslucency" to control the downsampling of standard translucency. Affect blend module materials
#jira UE-39505
Change 3389860 on 2017/04/12 by Richard.Wallis
UE-41407 Cable actor does not render correctly in viewport on Mac.
Build the mesh at creation time - call into existing mesh create function.
Change 3390933 on 2017/04/12 by Arne.Schober
DR - potential fix for UE-43125 where the this pointer might get invalidated in the middle of the function
#RB Marcus.Wassmer
Change 3391010 on 2017/04/12 by Ben.Marsh
Compile UE4Game non-unity for Mac as part of nightly builds in //UE4/Dev-Rendering.
Change 3391412 on 2017/04/12 by Uriel.Doyon
Mesh Decals are now sorted according to the component TranslucencySortPriority.
#jira UE-43053
Change 3392117 on 2017/04/13 by Guillaume.Abadie
Integrates Raven's experimental PCSS for cascaded shadow map hidden behind a CVar.
Change 3392179 on 2017/04/13 by Guillaume.Abadie
Attempts to fix linux compilation by removing mistakenly submitted dead code.
Change 3392231 on 2017/04/13 by Guillaume.Abadie
Fixes a wrong enum value real quick in FRenderingObjectVersion I introduced after main integration... Oups...
Change 3393879 on 2017/04/14 by Guillaume.Abadie
Attempts to fix linux compilation warning.
Change 3393881 on 2017/04/14 by Guillaume.Abadie
Back out changelist 3393879
Change 3393882 on 2017/04/14 by Guillaume.Abadie
Attempts #2 to fix linux compilation error.
Change 3394100 on 2017/04/14 by Chris.Bunner
Corrected material shared sampler usage with mip-biasing.
Change 3394174 on 2017/04/14 by Rolando.Caloca
DR - Change ensure to warning
Change 3394221 on 2017/04/14 by Marcus.Wassmer
Fix poseable mesh bounds calculation.
Change 3396238 on 2017/04/17 by David.Hill
Fix Bloom with LensFlare
Duplicating fix - will also fix directly in 4.16
#jira 44050
Change 3397055 on 2017/04/17 by Joe.Graf
Fixed Windows-specific assumptions in Slate File Dialog Window's file filtering that led to crashes
#CodeReview: matt.kuhlenschmidt
#rb: n/a
Change 3397921 on 2017/04/18 by Joe.Graf
Rewrote SlateFileDlgWindow's file filtering to allow for extensionless file selection and to remove the O(n^2) file filtering
#CodeReview: arciel.rekman, matt.kuhlenschmidt
#rb: n/a
Change 3398406 on 2017/04/18 by Rolando.Caloca
DR - Fix shaders in plugins on Mac
Change 3399546 on 2017/04/19 by Benjamin.Hyder
Updating content for test levels (HDR, Bloom_FFT, DistanceFields_IndirectShadows)
Change 3399725 on 2017/04/19 by Guillaume.Abadie
Avoids compiling PCSS shaders for SM4.
Change 3400295 on 2017/04/19 by Michael.Trepka
Fixed metal shader compile errors in MorphTargets.usf
Change 3400457 on 2017/04/19 by Michael.Trepka
Merged Rolando's shader fixes
Change 3400473 on 2017/04/19 by Arne.Schober
DR - provide Aftermath Reason when init failed.
#RB none
Change 3400699 on 2017/04/19 by Arne.Schober
DR - Fixed Text macro
#RB none
Change 3402280 on 2017/04/20 by Simon.Tovey
Minor cascade fix
#tests no crash
#jira UE-41560
Change 3402517 on 2017/04/20 by Arne.Schober
DR - Fix static analysis warning
#RB none
Change 3403897 on 2017/04/21 by Arne.Schober
DR - [UE-43898] - Someone missed a shader version bump which poisoned the DDC
#RB None
#jira UE-43898
Change 3404591 on 2017/04/21 by Olaf.Piesche
#jira UE-41979
Should never be crashing there, unless the mesh is changed after Init of the effect instance; this change safeguards against the number of mesh sections (and hence materials) changing after creation of the dynamic data to avoid the crash.
Change 3407451 on 2017/04/25 by Daniel.Wright
Fixed Indirect Lighting Cache updates caused by capsule indirect shadows forcing point samples, breaking primitives using ILCQ_Volume
Change 3407452 on 2017/04/25 by Daniel.Wright
Added r.AOJitterConeDirections, although disabled by default because it requires the temporal filter to be much stronger
Change 3408397 on 2017/04/25 by Daniel.Wright
ViewFamily.bRealtimeUpdate is set to false if Slate is throttling (like when toggling show flags). Volumetric fog discards the temporal history when not realtime, so you can see changes immediately.
Change 3408428 on 2017/04/25 by Daniel.Wright
Changed 'r.AOMaxObjectsPerCullTile' default back to 512 as 256 causes artifacts with RTDF shadows
Change 3409764 on 2017/04/26 by Daniel.Wright
Force dumping shader debug info for Global shaders when r.ShaderDevelopmentMode is enabled. Most of the shaders you want to look at in a GPU capture are global shaders, and global shaders create few debug files. 'recompileshaders global' time 35s -> 38s for SM5.
Change 3411659 on 2017/04/27 by Daniel.Wright
[Copy] Set Xbox One engine default screen percentage to 83.33 (1600x900), as ESRAM choices are dependent on this
Change 3411660 on 2017/04/27 by Daniel.Wright
[Copy] Global distance field composite shader has a version for each flattened axis, which improves efficiency when updating a slab which is what camera movement typically causes
Change 3411667 on 2017/04/27 by Daniel.Wright
[Copy] Discard distance field AO history buffer if it doesn't match the new buffer size. This prevents reading uninitialized data after a scene render target resize.
Change 3411668 on 2017/04/27 by Daniel.Wright
[Copy] Better indirect capsule shadow draw event info
Change 3411669 on 2017/04/27 by Daniel.Wright
[Copy] Pass down FeatureLevel to AddSubjectPrimitive and GatherShadowsForPrimitiveInner instead of calling the scene's virtual function. Showed up prominently in a sampling profile.
Change 3411755 on 2017/04/27 by Daniel.Wright
[Copy] Occlusion queries are now always done before the base pass if a nearly full prepass is being used (DDM_AllOccluders or DDM_AllOpaque)
* Removed r.OcclusionQueryLocation
Change 3411827 on 2017/04/27 by Daniel.Wright
[Copy] Much cheaper implementation of IsForwardShadingEnabled which showed up prominently in sampling profiles - inlined function and no more unnecessary thread safety overhead
Change 3411829 on 2017/04/27 by Daniel.Wright
Added an ensure to console manager when doing FindTConsoleVariableData* on a FAutoConsoleVariableRef
Change 3411837 on 2017/04/27 by Daniel.Wright
[Copy] Worked around slow memcpy's being used to sort FSortedLightSceneInfo
Change 3411838 on 2017/04/27 by Daniel.Wright
[Copy] Skip tracking MaterialRenderProxyMap on cooked platforms
Change 3411843 on 2017/04/27 by Daniel.Wright
[Copy] Fixed r.ParallelShadows on PS4 and enabled by default engine-wide (saves 5ms RT with CSM)
* Gnm was not tracking DepthClearValue when a depth target was set but not cleared
* Gnm has a bug where TargetsNeedingEliminateFastClear does not persist across commandlist breaks. Moved FinishRenderingGBuffer before RenderShadowDepthMaps to workaround (accidentally not in this changelist)
* Shadow depth rendering was not using BindClearMRTValues to populate GNM parallel commandlist TargetsNeedingEliminateFastClear values
Change 3411873 on 2017/04/27 by Daniel.Wright
[Copy] Deferred uniform expression caching. Setting multiple parameters on a material only causes its uniform expressions to be recached once.
* 280 calls to CacheUniformExpressions -> 120 during Fortnite combat (6.5ms -> 3.4ms)
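The deferral described above can be pictured as a dirty flag that is flushed once before use. The following is a minimal sketch under that assumption, not the engine's implementation: `MaterialSketch`, `FlushDeferredCaching` and `GetRecacheCount` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a material instance; not an engine class.
class MaterialSketch
{
public:
    // Setting a parameter only marks the cached expressions dirty.
    void SetScalarParameter(const std::string& Name, float Value)
    {
        Parameters[Name] = Value;
        bUniformExpressionsDirty = true;
    }

    // Assumed to be called once before the material is used for rendering.
    void FlushDeferredCaching()
    {
        if (bUniformExpressionsDirty)
        {
            CacheUniformExpressions();
            bUniformExpressionsDirty = false;
        }
        assert(!bUniformExpressionsDirty);
    }

    int GetRecacheCount() const { return RecacheCount; }

private:
    // Placeholder for the expensive evaluation; the sketch only counts calls.
    void CacheUniformExpressions() { ++RecacheCount; }

    std::unordered_map<std::string, float> Parameters;
    bool bUniformExpressionsDirty = false;
    int RecacheCount = 0;
};
```

With this shape, setting three parameters in a row and then flushing triggers a single recache instead of three.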
Change 3411891 on 2017/04/27 by Daniel.Wright
[Copy] GatherShadowPrimitives optimizations
* Total GatherShadowPrimitives went from 2.3ms -> 1.3ms on PS4 with these changes in GPUPerfTest (duplicated 3x)
* Much flatter primitive octree (16 -> 256 max primitives)
* Primitives are culled against the shadow frustum before FPrimitiveSceneInfo or FPrimitiveSceneProxy are dereferenced in FilterPrimitiveForShadows
* FilterPrimitiveForShadows work is done in a ParallelFor. Primitive octree nodes are processed in different jobs.
* StaticMeshWholeSceneShadowBatchVisibility now only stores entries for meshes with bRequiresPerElementVisibility (landscape). Previously it was allocating and zeroing 500Kb 3x per frame (main view + 2 cascades) which cost ~.8ms on PS4.
Change 3412192 on 2017/04/27 by Michael.Trepka
Fixed Clang compile errors in FortniteGame, partial copy of CL 3313426
Change 3412547 on 2017/04/27 by Daniel.Wright
Fixed leak of FShadowMapAllocation and FLightMapAllocation's found by licensee
Change 3414239 on 2017/04/28 by Arne.Schober
DR - UE-44500 - Removed use of Structured Buffer from MorphTargets due to HLSLCC not supporting it.
#RB none
#jira UE-44500
Change 3414754 on 2017/04/28 by Daniel.Wright
Added VolumetricFogEmissive to ExponentialHeightFogComponent
* Volumetric fog does not yet support precomputed lighting, so this is the only way to get an ambient lighting term
Change 3416859 on 2017/05/01 by Arne.Schober
DR - Remove FeatureLevel from the Clear Functions to reduce area of error
#RB Rolando.Caloca
Change 3420750 on 2017/05/03 by Arne.Schober
DR - [UE-44497] - Fix several PS4 validation layer issues
#RB Marcus.Wassmer
Change 3422869 on 2017/05/04 by Benjamin.Hyder
Fix compile error from merge.
Change 3423938 on 2017/05/04 by Marc.Olano
[UE-44453] Fix bloom problems by moving saturate after vector math
Change 3424494 on 2017/05/04 by Olaf.Piesche
#jira UE-44589
When using FindTConsoleVariableData, the CVar cannot be an FAutoConsoleVariable.
#tests as described in jira ticket
Change 3424754 on 2017/05/04 by Uriel.Doyon
Fixed call to get texture compressor module outside the main thread.
#jira UE-42168
Change 3425447 on 2017/05/05 by Uriel.Doyon
#buildfix
Change 3427042 on 2017/05/05 by Arne.Schober
DR - Fix one of my typos
#RB none
Change 3428119 on 2017/05/08 by Marcus.Wassmer
Fix UE-44733
static analysis warning.
Change 3428222 on 2017/05/08 by Uriel.Doyon
Fixed bad condition in translucency rendering
#jira UE-44452
Change 3429794 on 2017/05/08 by Uriel.Doyon
Fixed issues with lightshafts and low res translucency.
#jira UE-44452
Change 3430921 on 2017/05/09 by Rolando.Caloca
DR - Get additional function pointers for D3DReflect, Compile and Disassemble instructions from the same DLL when compiling D3D11 shaders.
- Also fixes using the correct fxc.exe path to match the DLL we distribute.
Change 3431156 on 2017/05/09 by Rolando.Caloca
DR - Remove unused code
Change 3431396 on 2017/05/09 by David.Hill
Copy of changes made directly in 4.16 ( CL 341037 )
to be submitted to dev-rendering
#jira UE-44641
Change 3431400 on 2017/05/09 by Rolando.Caloca
DR - Fix typo
Change 3431527 on 2017/05/09 by David.Hill
#rb: none
Oops.
comment out r.ShaderDevelopmentMode =1
Change 3431590 on 2017/05/09 by Daniel.Wright
Removed early return landmine in USceneCaptureComponent2D::Serialize
Change 3431591 on 2017/05/09 by Daniel.Wright
Disallow map building while in PIE, or PIE while building lighting
Change 3431594 on 2017/05/09 by Daniel.Wright
Added RenderTargetFormat to UTextureRenderTarget2D, with choices of 8 bit, 16fp, 32fp and 1, 2 or 4 channels.
Change 3431667 on 2017/05/09 by Daniel.Wright
Volumetric fog now supersamples lighting when the history is not available, reducing noise on areas that just came on-screen or after a camera cut.
* The number of samples is controlled by r.VolumetricFog.HistoryMissSupersampleCount, defaults to 4, cinematic scalability uses 16
* Under fast camera movement, volumetric fog cost went from 1.79ms -> 1.97ms with 4 samples, on a 970GTX
Change 3432366 on 2017/05/10 by Richard.Wallis
Fix for MetalRHI Asserts When Using "Profile GPU" With RHI-Thread/Parallel-Execution. Don't insert events when not in RHIThread or the actual single-threaded-render thread.
#jira UE-36006
Change 3432367 on 2017/05/10 by Richard.Wallis
Fix for Metal ReStartRenderPass assert with profiling. macOS metal asserts when using "profileGPU" even with -norhithread argument set.
Added no action to the allowed render pass restart store actions for the depth buffer avoiding the assert. Interested to know the details if this is not a valid assumption to make - throwing away the depth buffer after a render pass I think would be a common case.
#jira UE-44322
Change 3432409 on 2017/05/10 by Richard.Wallis
Merged across CL 3415890 from Release-4.16 fix for (jira UE-43895)
Fix for deferred store actions getting cleared when we don't have a valid render target.
Change 3432833 on 2017/05/10 by Daniel.Wright
Fixed Ocean compile error
Change 3432874 on 2017/05/10 by Marc.Olano
Improved captions for Noise and VectorNoise material nodes
Change 3432947 on 2017/05/10 by Richard.Wallis
Fix for shared Material Native Shader Libraries not functioning with iterative cooking. Keep the latest versions of shader byte code in the native shared material packaged build in an intermediate directory that can be reused on a later iterative cook.
- Doesn't handle deletion of the intermediate directory contents. Assumed to be a higher level requirement on non iterative cook flag.
#jira UE-44657
Change 3433484 on 2017/05/10 by Arne.Schober
DR - UE-44393 - Move ShaderPlatform into TShaderMap for extra debug information when it fails to find a proper shader. Also log when global shaders are verified and recompiled.
#jira UE-44393
#RB Daniel.Wright
Change 3433515 on 2017/05/10 by Arne.Schober
DR - Fix a bug where running recompileshaders while compiling causes a crash because the cached local vertex factories are mutated while being used.
#RB Daniel.Wright
Change 3433606 on 2017/05/10 by Daniel.Wright
Fixed static shadowing of volumetric fog and translucency causing shadowing past the lightmass importance volume.
Change 3433619 on 2017/05/10 by Daniel.Wright
Skip recapturing reflection captures when PropagateLightingScenarioChange is being called for a level unload. This leaves stale results in reflection captures around when hiding a level in the editor, but avoids the double recapture that happens when swapping lighting scenarios in game, and the unnecessary reflection capture update when exiting PIE.
Change 3433795 on 2017/05/10 by Arne.Schober
DR - add cmdline to select a GPU vendor when multiple GPUs from different vendors are installed in the same machine
#RB marcus.Wassmer
Change 3433941 on 2017/05/10 by Daniel.Wright
Cone vs tile bounding sphere intersection tests for Light Grid culling of spotlights, which provides much tighter culling than just View space tile AABB vs light bounding sphere.
* Forward shading BasePass 3.7ms -> 2.4ms in a scene with 24 spotlights on 970GTX
* Volumetric fog 2.87ms -> 2.09ms in the same scene
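To illustrate why a cone test culls more tightly than a bounding-sphere test, here is a minimal standalone sketch. It uses a commonly used conservative cone-versus-sphere check and is not taken from the engine's light grid code; `Vec3`, `SphereIntersectsSphere` and `ConeIntersectsSphere` are hypothetical names, and `Axis` is assumed to be normalized.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float X, Y, Z; };

static Vec3 Sub(Vec3 A, Vec3 B) { return Vec3{A.X - B.X, A.Y - B.Y, A.Z - B.Z}; }
static float Dot(Vec3 A, Vec3 B) { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }

// Coarse test: light bounding sphere vs tile bounding sphere.
bool SphereIntersectsSphere(Vec3 CenterA, float RadiusA, Vec3 CenterB, float RadiusB)
{
    const Vec3 Delta = Sub(CenterA, CenterB);
    const float RadiusSum = RadiusA + RadiusB;
    return Dot(Delta, Delta) <= RadiusSum * RadiusSum;
}

// Conservative spotlight cone vs tile bounding sphere test.
// It never rejects a sphere that touches the cone, but may accept a few that do not.
bool ConeIntersectsSphere(Vec3 Apex, Vec3 Axis, float Length, float HalfAngleRadians,
                          Vec3 Center, float Radius)
{
    assert(Radius >= 0.0f && Length >= 0.0f);
    const Vec3 ToCenter = Sub(Center, Apex);
    const float AlongAxis = Dot(ToCenter, Axis);
    // Clamp guards against a slightly negative value from rounding.
    const float PerpSq = std::max(0.0f, Dot(ToCenter, ToCenter) - AlongAxis * AlongAxis);
    // Signed distance from the sphere center to the cone's side surface.
    const float DistToSide = std::cos(HalfAngleRadians) * std::sqrt(PerpSq)
                           - AlongAxis * std::sin(HalfAngleRadians);
    const bool bOutsideSide = DistToSide > Radius;
    const bool bPastEnd = AlongAxis > Length + Radius;
    const bool bBehindApex = AlongAxis < -Radius;
    return !(bOutsideSide || bPastEnd || bBehindApex);
}
```

For a cone of length 10 with a 30 degree half angle pointing along +X from the origin, a tile sphere of radius 1 centered at (3, 6, 0) passes the sphere-versus-sphere test but is rejected by the cone test, which is the kind of tile the tighter culling saves.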
Change 3435139 on 2017/05/11 by Daniel.Wright
Restored GTextureRenderTarget2DMaxSizeX which is used by Ocean
Change 3435297 on 2017/05/11 by Arne.Schober
DR - Remove manual AlignOf and use C++11 keyword instead
#RB Steve.Robb
Change 3435367 on 2017/05/11 by Daniel.Wright
Circle vertex buffer for slightly tighter voxelization of volumetric fog shadowed lights
* 1.5ms -> 1.38ms on 970 GTX with 24 spotlights
Change 3435522 on 2017/05/11 by Brian.Karis
Dither opacity mask now stacks properly for non parallel polys. Dither is randomized by triangle normal.
Change 3436063 on 2017/05/11 by Daniel.Wright
Disabled CLB_AggressiveBatching for PC d3d12 as it causes flickering artifacts in lighting
Change 3436269 on 2017/05/11 by Uriel.Doyon
Fixed UVChannel data possibly not up-to-date depending on user manips.
Change 3436611 on 2017/05/12 by Simon.Tovey
Improved name and tooltip for static mesh property controlling generation of alias tables for uniform sampling.
Change 3436676 on 2017/05/12 by Simon.Tovey
Fix for fixed bounds being "invalid" unless set via the toolbar option.
Change 3436700 on 2017/05/12 by Simon.Tovey
Crash fix.
Issue found in https://udn.unrealengine.com/questions/355944/crash-in-fdynamicspriteemitterdatagetdynamicmeshel.html
Particle proxies would have stale material resource pointers if the material is changed while the system was invisible.
If the old material is freed during this time, the next time the system renders it will crash.
Change 3437367 on 2017/05/12 by Brian.Karis
Fixed bug with small UV charts not packing.
Change 3437860 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI alignment is 4 for int64
#RB none
Change 3437972 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI function calls cannot take aligned structures. In all of the cases the copy was completely unnecessary.
#RB none
Change 3437975 on 2017/05/12 by Chris.Bunner
Added calculation for MaterialParamsEx to MeshDecals.usf.
#jira UE-43052
Change 3438109 on 2017/05/12 by Rolando.Caloca
DR - Support for -nomcpp on SCW
Change 3438889 on 2017/05/15 by Chris.Bunner
Nullptr check in a few material uniform expressions.
Change 3439351 on 2017/05/15 by Chris.Bunner
Added tooltip to Power material expression.
Change 3439763 on 2017/05/15 by Daniel.Wright
Apply passed in DistanceBiasSqr to line lights - allows volumetric fog to reduce aliasing on line lights
Change 3439764 on 2017/05/15 by Daniel.Wright
Fixed order of operations with bTreatMaxDepthUnshadowed - manifested as unfiltered static shadow depth lookups
Change 3440722 on 2017/05/16 by Guillaume.Abadie
Exposes Scene capture's FOV to blueprints
Change 3441680 on 2017/05/16 by Uriel.Doyon
Added units to point light intensity, to allow the user to specify the value in candelas or lumens.
New point light actors now configure the intensity in candelas by default.
Replaced viewport exposure settings by an EV100 slider.
Hiding the tone mapper in the show flag now still applies the exposure.
Added a new AutoExposure method called EV100 which allows specifying:
- MinEV100, MaxEV100
- Calibration Constant
- Exposure Compensation
#jira UE-42783
Change 3441884 on 2017/05/16 by Uriel.Doyon
Fixed StreamingDistanceMultiplier not being applied to the texture streaming data.
Change 3442800 on 2017/05/17 by Gil.Gribb
Fixed botched merge.
Change 3442896 on 2017/05/17 by Gil.Gribb
UE4 - Allowed the possibility of running the RHI "thread" on task threads instead and cleaned up and unified the conditionals involved. By default we still have a dedicated RHI thread because it tested slightly faster.
Change 3443951 on 2017/05/17 by Richard.Wallis
Added Apple override allocator macro - each command encoder type needs its own allocator queue.
Change 3444787 on 2017/05/17 by Daniel.Wright
Fixed DBuffer decal default normal (used when DBuffer decals are enabled, but no decals are rendered) not reconstructing zero properly, adding -.008 to WorldNormal which then caused artifacts with forward lighting specular on materials with roughness near 0.
Change 3444882 on 2017/05/17 by Daniel.Wright
Added comment to FClearValueBinding::DefaultNormal8Bit to make the dependency on shader decode clear
Change 3444883 on 2017/05/17 by Brian.Karis
Improved contact shadows
Change 3445048 on 2017/05/17 by Daniel.Wright
Fixed particle lights in forward shading, they were not setting the lighting channel mask properly
Change 3445107 on 2017/05/17 by Michael.Trepka
Changed the order of operations in FMetalStateCache::SetRenderState to work around an issue with some Intel drivers where they would not recalculate the raster state in some edge cases.
#jira UE-43725
Change 3445212 on 2017/05/17 by Uriel.Doyon
Added a -CSV option to ListTextures command
Change 3445947 on 2017/05/18 by Richard.Wallis
Clone of Release-4.16 Stream CL 3437181 and CL 3442450 - fix(s) for black rendering on macOS El Cap with Nvidia GPU. Move sampling of EyeAdaption texture to pixel shader for Mac Metal using shader language version <= 1 only.
Change 3446545 on 2017/05/18 by Chris.Bunner
Removed hardcoded (and unused) MRT write from Decal shaders.
#jira UE-45095
Change 3446568 on 2017/05/18 by Marc.Olano
Sobol and image-based importance sampling C++ functions and blueprint nodes
Change 3446988 on 2017/05/18 by Marc.Olano
Fix build error: missing include
Change 3446990 on 2017/05/18 by Marc.Olano
Cell-indexed Sobol sampling for shaders (in MonteCarlo.usf) and materials (Sobol and TemporalSobol nodes)
Change 3447142 on 2017/05/18 by Rolando.Caloca
DR - RWLock instead of mutex for PSO cache
Change 3447144 on 2017/05/18 by Uriel.Doyon
Moved shading model code to SetGBufferFromShadingModel(). This allows the code to be reused in other shader files.
Change 3447794 on 2017/05/18 by Brian.Karis
Virtual texturing foundation code
Change 3448944 on 2017/05/19 by Arciel.Rekman
Fix non-unity Linux (and Mac, etc) builds.
- Mac fix is tentative, did not try.
Change 3449183 on 2017/05/19 by Marcus.Wassmer
Duplicate fix for reflection captures to happen after sequencer updates.
Change 3449196 on 2017/05/19 by Uriel.Doyon
Handling RCM_MinMax when reading FloatRGBA textures.
This fixes pixel inspector always reading 1 for scene color values greater than one.
Change 3451652 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45245
Change 3451660 on 2017/05/22 by Chris.Bunner
Additional compile fix.
#jira UE-45245
Change 3451897 on 2017/05/22 by Daniel.Wright
Moved RTDF shadow project back after the base pass, since it samples the GBuffer for subsurface shadowing. Removed r.DFShadowAsyncCompute which was relying on the previous ordering.
Change 3452055 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
#jira UE-45265
Change 3452089 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452108 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452179 on 2017/05/22 by Brian.Karis
Exposed dimensions. Fixed static analysis.
Change 3452734 on 2017/05/22 by Daniel.Wright
When post processing is disabled, TPT_TranslucencyAfterDOF translucency gets forced into the standard translucency pass.
Change 3452770 on 2017/05/22 by Daniel.Wright
Static light source shapes drawn into reflection captures handle SourceLength via scaled sphere
Change 3452861 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
Change 3452952 on 2017/05/22 by Brian.Karis
Small VT fixes
Change 3453647 on 2017/05/23 by Richard.Wallis
Fix for tessellation shaders on Mac (Metal v1.2) failing to compile.
#jira UE-45227
Change 3454844 on 2017/05/23 by Uriel.Doyon
Fixed extra X16 on some point lights
#jira UE-45250
Change 3454934 on 2017/05/23 by Chris.Bunner
Backing out changelists 3441680, 3454636 and 3454844 for the sake of integration stability.
Change 3457131 on 2017/05/24 by Arne.Schober
DR - [UE-45317] - Fix Depthbuffer not available for resolve in Forward mode
#jira UE-45317
#RB Chris.Bunner
Change 3457141 on 2017/05/24 by Marc.Olano
Sobol bug fixes
Change 3457953 on 2017/05/24 by Brian.Karis
Fix static analysis
#jira UE-45315
#jira UE-45314
#jira UE-45313
Change 3459064 on 2017/05/25 by Chris.Bunner
Fix for out of bounds material translation crash.
#jira UE-45406
Change 3459700 on 2017/05/25 by Brian.Karis
Revert using sprite index buffer because the vert order is different.
Change 3459847 on 2017/05/25 by Chris.Bunner
Fixing ensure in RenderTestMap.
[CL 3461201 by Chris Bunner in Main branch]
2017-05-26 08:22:50 -04:00
2020-07-06 18:58:26 -04:00
FCullObjectsToClipmapCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FCullObjectsToClipmapCS::FParameters>();
PassParameters->RWObjectIndexBuffer = GraphBuilder.CreateUAV(ObjectIndexBuffer, PF_R32_UINT);
PassParameters->RWObjectIndexNumBuffer = GraphBuilder.CreateUAV(ObjectIndexNumBuffer, PF_R32_UINT);
2022-04-26 14:37:07 -04:00
PassParameters->DistanceFieldObjectBuffers = DistanceField::SetupObjectBufferParameters(GraphBuilder, DistanceFieldSceneData);
2022-02-02 07:59:31 -05:00
PassParameters->ClipmapWorldCenter = (FVector3f)Clipmap.Bounds.GetCenter();
PassParameters->ClipmapWorldExtent = (FVector3f)Clipmap.Bounds.GetExtent();
2020-07-06 18:58:26 -04:00
PassParameters->AcceptOftenMovingObjectsOnly = AcceptOftenMovingObjectsOnlyValue;
2022-01-26 17:07:27 -05:00
const float RadiusThresholdScale = bLumenEnabled ? 1.0f / FMath::Clamp(View.FinalPostProcessSettings.LumenSceneDetail, .01f, 100.0f) : 1.0f;
PassParameters->MeshSDFRadiusThreshold = GetMinMeshSDFRadius(ClipmapVoxelSize.X) * RadiusThresholdScale;
2020-09-08 17:44:06 -04:00
PassParameters->InfluenceRadiusSq = ClipmapInfluenceRadius * ClipmapInfluenceRadius;
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3461187)
#lockdown Nick.Penwarden
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3388286 on 2017/04/11 by Chris.Bunner
Fix mips in texture GnmUAV creation.
Change 3388287 on 2017/04/11 by Chris.Bunner
Improved PS/CS code sharing for TemporalAA.
Change 3388291 on 2017/04/11 by Chris.Bunner
HLODs now correctly hide their children in shadow maps.
Propagate bCastFarShadow flag on HLOD generation.
#jira UE-42254
Change 3388448 on 2017/04/11 by Brian.Karis
Better handle divide by zero
Change 3388449 on 2017/04/11 by Brian.Karis
Optimizations to shading model math.
PR #3340: Bug fixes related to shader TODOs (Contributed by vgfx)
Change 3388455 on 2017/04/11 by Uriel.Doyon
Changed Remove for RemoveSwap when clearing dynamic component references
Change 3388612 on 2017/04/11 by Simon.Tourangeau
Support shaders in projects and in plugins
When searching for a shader it will
- First look in Engine/Shaders as usual
- Then in project's Shader folder
- Then in all enabled plugin Shader folders
Project or plugin must be loaded in PostConfigInit phase
Tested in PIE, engine (cooked, packaged)
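The lookup order listed above can be sketched as a first-match search over an ordered list of directories. This is only an illustration under stated assumptions, not the engine's shader file resolution: `FindShaderFile`, its parameters, and the injected `FileExists` callback are hypothetical, and the result is an empty string when nothing matches.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Returns the first existing candidate path, or an empty string if none exists.
// Search order: engine shader directory, then project, then each enabled plugin.
std::string FindShaderFile(
    const std::string& RelativePath,
    const std::string& EngineShaderDir,
    const std::string& ProjectShaderDir,
    const std::vector<std::string>& PluginShaderDirs,
    const std::function<bool(const std::string&)>& FileExists)
{
    assert(!RelativePath.empty());

    std::vector<std::string> SearchDirs;
    SearchDirs.push_back(EngineShaderDir);
    SearchDirs.push_back(ProjectShaderDir);
    SearchDirs.insert(SearchDirs.end(), PluginShaderDirs.begin(), PluginShaderDirs.end());

    for (const std::string& Dir : SearchDirs)
    {
        const std::string Candidate = Dir + "/" + RelativePath;
        if (FileExists(Candidate))
        {
            return Candidate;
        }
    }
    return std::string();
}
```

Passing the existence check as a callback keeps the sketch independent of any particular file system API; a shader present in both the project and a plugin resolves to the project copy because the project directory is searched first.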
Change 3388819 on 2017/04/11 by Arne.Schober
DR - Faster MorphTarget implementation. Changed the previous gather approach to a scatter-based one. Reaching about 110GB/s on PS4, which is up to 4x faster than the previous implementation. On PC DX11 the impact is lower due to unnecessary UAV barriers but still 2x faster on AMD and up to 6x faster on Nvidia hardware.
#RB Lina.Halper, Rolando.Caloca
Change 3388862 on 2017/04/11 by Guillaume.Abadie
Allows Motion Blur and TAA in scene capture 2d.
Change 3388953 on 2017/04/11 by Uriel.Doyon
Fixed issue where lights from hidden levels were affecting the lighting build, by checking if the light is registered before adding it.
#UE-43220
Change 3389138 on 2017/04/11 by Arne.Schober
DR - Fix crash when opening a Level without Contentbrowser open.
#RB Matt.Kuhlenschmidt
Change 3389400 on 2017/04/11 by Uriel.Doyon
- Renamed FMaterialResource::IsSeparateTranslucencyEnabled() to FMaterialResource::IsTranslucencyAfterDOFEnabled()
- Removed different logic to determine if translucency after DOF was enabled, and centralized it into a single function: FSceneViewFamily::AllowTranslucencyAfterDOF()
- FSceneRenderTargets::FinishRenderingSeparateTranslucency() now only resolves a single view, allowing better Begin/Finish scopes.
- Renamed FSceneRenderTargets::SeparateTranslucencyDepthRT into FSceneRenderTargets::DownsampledTranslucencyDepthRT since this one is only allocated when rendering in downsampled mode.
- Standard translucency is now rendered at the same resolution as translucency after DOF (downsampled or full resolution).
- Removed RenderTranslucencyParallel and merged its logic into RenderTranslucency. Renamed DrawAllTranslucencyPasses to RenderViewTranslucency and added a parallel version RenderViewTranslucencyParallel.
- Moved all debug draw logic (VisualizeLPV, ViewMeshElements and SimpleElementCollector) to a common place.
- New option "r.AllowDownsampledStandardTranslucency" to control the downsampling of standard translucency. Affect blend module materials
#jira UE-39505
Change 3389860 on 2017/04/12 by Richard.Wallis
UE-41407 Cable actor does not render correctly in viewport on Mac.
Build the mesh at creation time - call into existing mesh create function.
Change 3390933 on 2017/04/12 by Arne.Schober
DR - potential fix for UE-43125 where the this pointer might get invalidated in the middle of the function
#RB Marcus.Wassmer
Change 3391010 on 2017/04/12 by Ben.Marsh
Compile UE4Game non-unity for Mac as part of nightly builds in //UE4/Dev-Rendering.
Change 3391412 on 2017/04/12 by Uriel.Doyon
Mesh Decals are now sorted according to the component TranslucencySortPriority.
#jira UE-43053
Change 3392117 on 2017/04/13 by Guillaume.Abadie
Integrates Raven's experimental PCSS for cascaded shadow map hidden behind a CVar.
Change 3392179 on 2017/04/13 by Guillaume.Abadie
Attempts to fix linux compilation by removing mistakenly submitted dead code.
Change 3392231 on 2017/04/13 by Guillaume.Abadie
Fixes a wrong enum value real quick in FRenderingObjectVersion I introduced after main integration... Oups...
Change 3393879 on 2017/04/14 by Guillaume.Abadie
Attempts to fix linux compilation warning.
Change 3393881 on 2017/04/14 by Guillaume.Abadie
Back out changelist 3393879
Change 3393882 on 2017/04/14 by Guillaume.Abadie
Attempts #2 to fix linux compilation error.
Change 3394100 on 2017/04/14 by Chris.Bunner
Corrected material shared sampler usage with mip-biasing.
Change 3394174 on 2017/04/14 by Rolando.Caloca
DR - Change ensure to warning
Change 3394221 on 2017/04/14 by Marcus.Wassmer
Fix poseable mesh bounds calculation.
Change 3396238 on 2017/04/17 by David.Hill
Fix Bloom with LensFlare
Duplicating fix - will also fix directly in 4.16
#jira 44050
Change 3397055 on 2017/04/17 by Joe.Graf
Fixed Windows-specific assumptions in Slate File Dialog Window's file filtering that led to crashes
#CodeReview: matt.kuhlenschmidt
#rb: n/a
Change 3397921 on 2017/04/18 by Joe.Graf
Rewrote SlateFileDlgWindow's file filtering to allow for extensionless file selection and to remove the O(n^2) file filtering
#CodeReview: arciel.rekman, matt.kuhlenschmidt
#rb: n/a
Change 3398406 on 2017/04/18 by Rolando.Caloca
DR - Fix shaders in plugins on Mac
Change 3399546 on 2017/04/19 by Benjamin.Hyder
Updating content for test levels (HDR, Bloom_FFT, DistanceFields_IndirectShadows)
Change 3399725 on 2017/04/19 by Guillaume.Abadie
Avoids compiling PCSS shaders for SM4.
Change 3400295 on 2017/04/19 by Michael.Trepka
Fixed metal shader compile errors in MorphTargets.usf
Change 3400457 on 2017/04/19 by Michael.Trepka
Merged Rolando's shader fixes
Change 3400473 on 2017/04/19 by Arne.Schober
DR - provide Aftermath Reason when init failed.
#RB none
Change 3400699 on 2017/04/19 by Arne.Schober
DR - Fixed Text macro
#RB none
Change 3402280 on 2017/04/20 by Simon.Tovey
Minor cascade fix
#tests no crash
#jira UE-41560
Change 3402517 on 2017/04/20 by Arne.Schober
DR - Fix static analysis warning
#RB none
Change 3403897 on 2017/04/21 by Arne.Schober
DR - [UE-43898] - Someone missed a shader version bump which poisoned the DDC
#RB None
#jira UE-43898
Change 3404591 on 2017/04/21 by Olaf.Piesche
#jira UE-41979
Should never be crashing there, unless the mesh is changed after Init of the effect instance; this change safeguards against the number of mesh sections (and hence materials) changing after creation of the dynamic data to avoid the crash.
Change 3407451 on 2017/04/25 by Daniel.Wright
Fixed Indirect Lighting Cache updates caused by capsule indirect shadows forcing point samples, breaking primitives using ILCQ_Volume
Change 3407452 on 2017/04/25 by Daniel.Wright
Added r.AOJitterConeDirections, although disabled by default because it requires the temporal filter to be much stronger
Change 3408397 on 2017/04/25 by Daniel.Wright
ViewFamily.bRealtimeUpdate is set to false if Slate is throttling (like when toggling show flags). Volumetric fog discards the temporal history when not realtime, so you can see changes immediately.
Change 3408428 on 2017/04/25 by Daniel.Wright
Changed 'r.AOMaxObjectsPerCullTile' default back to 512 as 256 causes artifacts with RTDF shadows
Change 3409764 on 2017/04/26 by Daniel.Wright
Force dumping shader debug info for Global shaders when r.ShaderDevelopmentMode is enabled. Most of the shaders you want to look at in a GPU capture are global shaders, and global shaders create few debug files. 'recompileshaders global' time 35s -> 38s for SM5.
Change 3411659 on 2017/04/27 by Daniel.Wright
[Copy] Set Xbox One engine default screen percentage to 83.33 (1600x900), as ESRAM choices are dependent on this
Change 3411660 on 2017/04/27 by Daniel.Wright
[Copy] Global distance field composite shader has a version for each flattened axis, which improves efficiency when updating a slab which is what camera movement typically causes
Change 3411667 on 2017/04/27 by Daniel.Wright
[Copy] Discard distance field AO history buffer if it doesn't match the new buffer size. This prevents reading uninitialized data after a scene render target resize.
Change 3411668 on 2017/04/27 by Daniel.Wright
[Copy] Better indirect capsule shadow draw event info
Change 3411669 on 2017/04/27 by Daniel.Wright
[Copy] Pass down FeatureLevel to AddSubjectPrimitive and GatherShadowsForPrimitiveInner instead of calling the scene's virtual function. Showed up prominently in a sampling profile.
Change 3411755 on 2017/04/27 by Daniel.Wright
[Copy] Occlusion queries are now always done before the base pass if a nearly full prepass is being used (DDM_AllOccluders or DDM_AllOpaque)
* Removed r.OcclusionQueryLocation
Change 3411827 on 2017/04/27 by Daniel.Wright
[Copy] Much cheaper implementation of IsForwardShadingEnabled which showed up prominently in sampling profiles - inlined function and no more unnecessary thread safety overhead
Change 3411829 on 2017/04/27 by Daniel.Wright
Added an ensure to console manager when doing FindTConsoleVariableData* on a FAutoConsoleVariableRef
Change 3411837 on 2017/04/27 by Daniel.Wright
[Copy] Worked around slow memcpy's being used to sort FSortedLightSceneInfo
Change 3411838 on 2017/04/27 by Daniel.Wright
[Copy] Skip tracking MaterialRenderProxyMap on cooked platforms
Change 3411843 on 2017/04/27 by Daniel.Wright
[Copy] Fixed r.ParallelShadows on PS4 and enabled by default engine-wide (saves 5ms RT with CSM)
* Gnm was not tracking DepthClearValue when a depth target was set but not cleared
* Gnm has a bug where TargetsNeedingEliminateFastClear does not persist across commandlist breaks. Moved FinishRenderingGBuffer before RenderShadowDepthMaps to workaround (accidentally not in this changelist)
* Shadow depth rendering was not using BindClearMRTValues to populate GNM parallel commandlist TargetsNeedingEliminateFastClear values
Change 3411873 on 2017/04/27 by Daniel.Wright
[Copy] Deferred uniform expression caching. Setting multiple parameters on a material only causes its uniform expressions to be recached once.
* 280 calls to CacheUniformExpressions -> 120 during Fortnite combat (6.5ms -> 3.4ms)
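The deferral described in this change can be illustrated with a dirty flag. The following is a minimal sketch only, not the engine's implementation: FMaterialSketch, SetScalarParameter, FlushDeferredRecache and the other names are hypothetical stand-ins chosen for illustration. Setting a parameter only marks the cached values dirty; the recache runs once when the flush is requested.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a material instance; not an engine type.
class FMaterialSketch
{
public:
    // Record the new value and mark the cache dirty instead of recaching now.
    void SetScalarParameter(const std::string& Name, float Value)
    {
        Parameters[Name] = Value;
        bUniformExpressionsDirty = true;
    }

    // Recache once, no matter how many parameters changed since the last flush.
    void FlushDeferredRecache()
    {
        if (!bUniformExpressionsDirty)
        {
            return;
        }
        CachedUniforms = Parameters; // stands in for the expensive recache
        ++RecacheCount;
        bUniformExpressionsDirty = false;
    }

    // Read a cached value; callers are expected to flush first.
    float GetCachedValue(const std::string& Name) const
    {
        assert(!bUniformExpressionsDirty);
        return CachedUniforms.at(Name);
    }

    int GetRecacheCount() const { return RecacheCount; }

private:
    std::unordered_map<std::string, float> Parameters;
    std::unordered_map<std::string, float> CachedUniforms;
    bool bUniformExpressionsDirty = false;
    int RecacheCount = 0;
};
```

In this sketch, three SetScalarParameter calls followed by one FlushDeferredRecache perform a single recache, which is the effect the call-count reduction above describes.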
Change 3411891 on 2017/04/27 by Daniel.Wright
[Copy] GatherShadowPrimitives optimizations
* Total GatherShadowPrimitives went from 2.3ms -> 1.3ms on PS4 with these changes in GPUPerfTest (duplicated 3x)
* Much flatter primitive octree (16 -> 256 max primitives)
* Primitives are culled against the shadow frustum before FPrimitiveSceneInfo or FPrimitiveSceneProxy are dereferenced in FilterPrimitiveForShadows
* FilterPrimitiveForShadows work is done in a ParallelFor. Primitive octree nodes are processed in different jobs.
* StaticMeshWholeSceneShadowBatchVisibility now only stores entries for meshes with bRequiresPerElementVisibility (landscape). Previously it was allocating and zeroing 500Kb 3x per frame (main view + 2 cascades) which cost ~.8ms on PS4.
Change 3412192 on 2017/04/27 by Michael.Trepka
Fixed Clang compile errors in FortniteGame, partial copy of CL 3313426
Change 3412547 on 2017/04/27 by Daniel.Wright
Fixed leak of FShadowMapAllocation and FLightMapAllocation's found by licensee
Change 3414239 on 2017/04/28 by Arne.Schober
DR - UE-44500 - Removed use of Structured Buffer from MorphTargets due to HLSLCC not supporting it.
#RB none
#jira UE-44500
Change 3414754 on 2017/04/28 by Daniel.Wright
Added VolumetricFogEmissive to ExponentialHeightFogComponent
* Volumetric fog does not yet support precomputed lighting, so this is the only way to get an ambient lighting term
Change 3416859 on 2017/05/01 by Arne.Schober
DR - Remove FeatureLevel from the Clear Functions to reduce area of error
#RB Rolando.Caloca
Change 3420750 on 2017/05/03 by Arne.Schober
DR - [UE-44497] - Fix several PS4 validation layer issues
#RB Marcus.Wassmer
Change 3422869 on 2017/05/04 by Benjamin.Hyder
Fix compile error from merge.
Change 3423938 on 2017/05/04 by Marc.Olano
[UE-44453] Fix bloom problems by moving saturate after vector math
Change 3424494 on 2017/05/04 by Olaf.Piesche
#jira UE-44589
When using FindTConsoleVariableData, the CVar cannot be an FAutoConsoleVariable.
#tests as described in jira ticket
Change 3424754 on 2017/05/04 by Uriel.Doyon
Fixed call to get texture compressor module outside the main thread.
#jira UE-42168
Change 3425447 on 2017/05/05 by Uriel.Doyon
#buildfix
Change 3427042 on 2017/05/05 by Arne.Schober
DR - Fix one of my typos
#RB none
Change 3428119 on 2017/05/08 by Marcus.Wassmer
Fix UE-44733
static analysis warning.
Change 3428222 on 2017/05/08 by Uriel.Doyon
Fixed bad condition in translucency rendering
#jira UE-44452
Change 3429794 on 2017/05/08 by Uriel.Doyon
Fixed issues with lightshafts and low res translucency.
#jira UE-44452
Change 3430921 on 2017/05/09 by Rolando.Caloca
DR - Get additional function pointers for D3DReflect, Compile and Disassemble instructions from the same DLL when compiling D3D11 shaders.
- Also fixes using the correct fxc.exe path to match the DLL we distribute.
Change 3431156 on 2017/05/09 by Rolando.Caloca
DR - Remove unused code
Change 3431396 on 2017/05/09 by David.Hill
Copy of changes made directly in 4.16 ( CL 341037 )
to be submitted to dev-rendering
#jira UE-44641
Change 3431400 on 2017/05/09 by Rolando.Caloca
DR - Fix typo
Change 3431527 on 2017/05/09 by David.Hill
#rb: none
Oops.
comment out r.ShaderDevelopmentMode =1
Change 3431590 on 2017/05/09 by Daniel.Wright
Removed early return landmine in USceneCaptureComponent2D::Serialize
Change 3431591 on 2017/05/09 by Daniel.Wright
Disallow map building while in PIE, or PIE while building lighting
Change 3431594 on 2017/05/09 by Daniel.Wright
Added RenderTargetFormat to UTextureRenderTarget2D, with choices of 8 bit, 16fp, 32fp and 1, 2 or 4 channels.
Change 3431667 on 2017/05/09 by Daniel.Wright
Volumetric fog now supersamples lighting when the history is not available, reducing noise on areas that just came on-screen or after a camera cut.
* The number of samples is controlled by r.VolumetricFog.HistoryMissSupersampleCount, defaults to 4, cinematic scalability uses 16
* Under fast camera movement, volumetric fog cost went from 1.79ms -> 1.97ms with 4 samples, on a 970GTX
Change 3432366 on 2017/05/10 by Richard.Wallis
Fix for MetalRHI Asserts When Using "Profile GPU" With RHI-Thread/Parallel-Execution. Don't insert events when not in RHIThread or the actual single-threaded-render thread.
#jira UE-36006
Change 3432367 on 2017/05/10 by Richard.Wallis
Fix for Metal ReStartRenderPass assert with profiling. macOS metal asserts when using "profileGPU" even with -norhithread argument set.
Added no action to the allowed render pass restart store actions for the depth buffer avoiding the assert. Interested to know the details if this is not a valid assumption to make - throwing away the depth buffer after a render pass I think would be a common case.
#jira UE-44322
Change 3432409 on 2017/05/10 by Richard.Wallis
Merged across CL 3415890 from Release-4.16 fix for (jira UE-43895)
Fix for deferred store actions getting cleared when we don't have a valid render target.
Change 3432833 on 2017/05/10 by Daniel.Wright
Fixed Ocean compile error
Change 3432874 on 2017/05/10 by Marc.Olano
Improved captions for Noise and VectorNoise material nodes
Change 3432947 on 2017/05/10 by Richard.Wallis
Fix for shared Material Native Shader Libraries Don't Function With Iterative Cooking. Keep latest versions of shader byte code in native shared material packaged build in an intermediate directory that can be reused on a later iterative cook.
- Doesn't handle deletion of the intermediate directory contents. Assumed to be a higher level requirement on non iterative cook flag.
#jira UE-44657
Change 3433484 on 2017/05/10 by Arne.Schober
DR - UE-44393 - Move ShaderPlatform into TShaderMap for extra debug information when it fails to find a proper shader. Also log when global shaders are verified and recompiled.
#jira UE-44393
#RB Daniel.Wright
Change 3433515 on 2017/05/10 by Arne.Schober
DR - Fix a bug where recompileshaders changed while compiling causes a crash where the cached local vertex factories are mutated while being used.
#RB Daniel.Wright
Change 3433606 on 2017/05/10 by Daniel.Wright
Fixed static shadowing of volumetric fog and translucency causing shadowing past the lightmass importance volume.
Change 3433619 on 2017/05/10 by Daniel.Wright
Skip recapturing reflection captures when PropagateLightingScenarioChange is being called for a level unload. This leaves stale results in reflection captures around when hiding a level in the editor, but avoids the double recapture that happens when swapping lighting scenarios in game, and the unnecessary reflection capture update when exiting PIE.
Change 3433795 on 2017/05/10 by Arne.Schober
DR - add cmdline to select a GPU vendor when multiple GPUs from different vendors are installed in the same machine
#RB marcus.Wassmer
Change 3433941 on 2017/05/10 by Daniel.Wright
Cone vs tile bounding sphere intersection tests for Light Grid culling of spotlights, which provides much tighter culling than just View space tile AABB vs light bounding sphere.
* Forward shading BasePass 3.7ms -> 2.4ms in a scene with 24 spotlights on 970GTX
* Volumetric fog 2.87ms -> 2.09ms in the same scene
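For illustration, a cone versus bounding sphere test of the kind named in this change can be sketched as below. This is a commonly used formulation and is only an assumption about the general technique; it is not taken from the engine's Light Grid shader. FVec3Sketch, ConeIntersectsSphere and all parameter names are hypothetical. ConeAxis is assumed to be unit length.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type; not an engine type.
struct FVec3Sketch { float X, Y, Z; };

inline float Dot(const FVec3Sketch& A, const FVec3Sketch& B)
{
    return A.X * B.X + A.Y * B.Y + A.Z * B.Z;
}

inline FVec3Sketch Sub(const FVec3Sketch& A, const FVec3Sketch& B)
{
    return { A.X - B.X, A.Y - B.Y, A.Z - B.Z };
}

// Returns false only when the sphere is outside the cone's side surface,
// beyond its far end, or behind its origin.
inline bool ConeIntersectsSphere(
    const FVec3Sketch& ConeOrigin,
    const FVec3Sketch& ConeAxis,   // assumed normalized
    float ConeLength,
    float ConeHalfAngleRadians,
    const FVec3Sketch& SphereCenter,
    float SphereRadius)
{
    assert(SphereRadius >= 0.0f);

    const FVec3Sketch V = Sub(SphereCenter, ConeOrigin);
    const float DistSq = Dot(V, V);
    const float AlongAxis = Dot(V, ConeAxis);

    // Perpendicular distance from the sphere center to the cone axis.
    const float Perp = std::sqrt(std::max(0.0f, DistSq - AlongAxis * AlongAxis));

    // Signed distance from the sphere center to the cone's side surface.
    const float DistToSide = std::cos(ConeHalfAngleRadians) * Perp
                           - std::sin(ConeHalfAngleRadians) * AlongAxis;

    const bool bOutsideSide = DistToSide > SphereRadius;
    const bool bBeyondEnd   = AlongAxis > SphereRadius + ConeLength;
    const bool bBehind      = AlongAxis < -SphereRadius;

    return !(bOutsideSide || bBeyondEnd || bBehind);
}
```

Compared with testing the light's bounding sphere alone, a test like this rejects tiles that lie inside the spotlight's bounding sphere but outside its cone.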
Change 3435139 on 2017/05/11 by Daniel.Wright
Restored GTextureRenderTarget2DMaxSizeX which is used by Ocean
Change 3435297 on 2017/05/11 by Arne.Schober
DR - Remove manual AlignOf and use C++11 keyword instead
#RB Steve.Robb
Change 3435367 on 2017/05/11 by Daniel.Wright
Circle vertex buffer for slightly tighter voxelization of volumetric fog shadowed lights
* 1.5ms -> 1.38ms on 970 GTX with 24 spotlights
Change 3435522 on 2017/05/11 by Brian.Karis
Dither opacity mask now stacks properly for non parallel polys. Dither is randomized by triangle normal.
Change 3436063 on 2017/05/11 by Daniel.Wright
Disabled CLB_AggressiveBatching for PC d3d12 as it causes flickering artifacts in lighting
Change 3436269 on 2017/05/11 by Uriel.Doyon
Fixed UVChannel data possibly not up-to-date depending on user manips.
Change 3436611 on 2017/05/12 by Simon.Tovey
Improved name and tooltip for static mesh property controlling generation of alias tables for uniform sampling.
Change 3436676 on 2017/05/12 by Simon.Tovey
Fix for fixed bounds being "invalid" unless set via the toolbar option.
Change 3436700 on 2017/05/12 by Simon.Tovey
Crash fix.
Issue found in https://udn.unrealengine.com/questions/355944/crash-in-fdynamicspriteemitterdatagetdynamicmeshel.html
Particle proxies would have stale material resource pointers if the material is changed while the system was invisible.
If the old material is freed during this time, the next time the system renders it will crash.
Change 3437367 on 2017/05/12 by Brian.Karis
Fixed bug with small UV charts not packing.
Change 3437860 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI alignment is 4 for int64
#RB none
Change 3437972 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI function calls cannot take aligned structures. In all of the cases the copy was completely unnecessary.
#RB none
Change 3437975 on 2017/05/12 by Chris.Bunner
Added calculation for MaterialParamsEx to MeshDecals.usf.
#jira UE-43052
Change 3438109 on 2017/05/12 by Rolando.Caloca
DR - Support for -nomcpp on SCW
Change 3438889 on 2017/05/15 by Chris.Bunner
Nullptr check in a few material uniform expressions.
Change 3439351 on 2017/05/15 by Chris.Bunner
Added tooltip to Power material expression.
Change 3439763 on 2017/05/15 by Daniel.Wright
Apply passed in DistanceBiasSqr to line lights - allows volumetric fog to reduce aliasing on line lights
Change 3439764 on 2017/05/15 by Daniel.Wright
Fixed order of operations with bTreatMaxDepthUnshadowed - manifested as unfiltered static shadow depth lookups
Change 3440722 on 2017/05/16 by Guillaume.Abadie
Exposes Scene capture's FOV to blueprints
Change 3441680 on 2017/05/16 by Uriel.Doyon
Added units to point light intensity, to allow the user to specify the value in candelas or lumens.
New point light actors now configure the intensity in candelas by default.
Replaced viewport exposure settings by an EV100 slider.
Hiding the tone mapper in the show flag now still applies the exposure.
Added a new AutoExposure method called EV100 which allows specifying:
- MinEV100, MaxEV100
- Calibration Constant
- Exposure Compensation
#jira UE-42783
Change 3441884 on 2017/05/16 by Uriel.Doyon
Fixed StreamingDistanceMultiplier not being applied to the texture streaming data.
Change 3442800 on 2017/05/17 by Gil.Gribb
Fixed botched merge.
Change 3442896 on 2017/05/17 by Gil.Gribb
UE4 - Allowed the possibility of running the RHI "thread" on task threads instead and cleaned up and unified the conditionals involved. By default we still have a dedicated RHI thread because it tested slightly faster.
Change 3443951 on 2017/05/17 by Richard.Wallis
Added Apple override allocator macro - each command encoder type needs its own allocator queue.
Change 3444787 on 2017/05/17 by Daniel.Wright
Fixed DBuffer decal default normal (used when DBuffer decals are enabled, but no decals are rendered) not reconstructing zero properly, adding -.008 to WorldNormal which then caused artifacts with forward lighting specular on materials with roughness near 0.
Change 3444882 on 2017/05/17 by Daniel.Wright
Added comment to FClearValueBinding::DefaultNormal8Bit to make the dependency on shader decode clear
Change 3444883 on 2017/05/17 by Brian.Karis
Improved contact shadows
Change 3445048 on 2017/05/17 by Daniel.Wright
Fixed particle lights in forward shading, they were not setting the lighting channel mask properly
Change 3445107 on 2017/05/17 by Michael.Trepka
Changed the order of operations in FMetalStateCache::SetRenderState to work around an issue with some Intel drivers where they would not recalculate the raster state in some edge cases.
#jira UE-43725
Change 3445212 on 2017/05/17 by Uriel.Doyon
Added a -CSV option to ListTextures command
Change 3445947 on 2017/05/18 by Richard.Wallis
Clone of Release-4.16 Stream CL 3437181 and CL 3442450 - fix(s) for black rendering on macOS El Cap with Nvidia GPU. Move sampling of EyeAdaption texture to pixel shader for Mac Metal using shader language version <= 1 only.
Change 3446545 on 2017/05/18 by Chris.Bunner
Removed hardcoded (and unused) MRT write from Decal shaders.
#jira UE-45095
Change 3446568 on 2017/05/18 by Marc.Olano
Sobol and image-based importance sampling C++ functions and blueprint nodes
Change 3446988 on 2017/05/18 by Marc.Olano
Fix build error: missing include
Change 3446990 on 2017/05/18 by Marc.Olano
Cell-indexed Sobol sampling for shaders (in MonteCarlo.usf) and materials (Sobol and TemporalSobol nodes)
Change 3447142 on 2017/05/18 by Rolando.Caloca
DR - RWLock instead of mutex for PSO cache
Change 3447144 on 2017/05/18 by Uriel.Doyon
Moved shading model code to SetGBufferFromShadingModel(). This allows the code to be reused in other shader files.
Change 3447794 on 2017/05/18 by Brian.Karis
Virtual texturing foundation code
Change 3448944 on 2017/05/19 by Arciel.Rekman
Fix non-unity Linux (and Mac, etc) builds.
- Mac fix is tentative, did not try.
Change 3449183 on 2017/05/19 by Marcus.Wassmer
Duplicate fix for reflection captures to happen after sequencer updates.
Change 3449196 on 2017/05/19 by Uriel.Doyon
Handling RCM_MinMax when reading FloatRGBA textures.
This fixes pixel inspector always reading 1 for scene color values greater than one.
Change 3451652 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45245
Change 3451660 on 2017/05/22 by Chris.Bunner
Additional compile fix.
#jira UE-45245
Change 3451897 on 2017/05/22 by Daniel.Wright
Moved RTDF shadow project back after the base pass, since it samples the GBuffer for subsurface shadowing. Removed r.DFShadowAsyncCompute which was relying on the previous ordering.
Change 3452055 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
#jira UE-45265
Change 3452089 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452108 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452179 on 2017/05/22 by Brian.Karis
Exposed dimensions. Fixed static analysis.
Change 3452734 on 2017/05/22 by Daniel.Wright
When post processing is disabled, TPT_TranslucencyAfterDOF translucency gets forced into the standard translucency pass.
Change 3452770 on 2017/05/22 by Daniel.Wright
Static light source shapes drawn into reflection captures handle SourceLength via scaled sphere
Change 3452861 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
Change 3452952 on 2017/05/22 by Brian.Karis
Small VT fixes
Change 3453647 on 2017/05/23 by Richard.Wallis
Fix for tessellation shaders on Mac (Metal v1.2) failing to compile.
#jira UE-45227
Change 3454844 on 2017/05/23 by Uriel.Doyon
Fixed extra X16 on some point lights
#jira UE-45250
Change 3454934 on 2017/05/23 by Chris.Bunner
Backing out changelists 3441680, 3454636 and 3454844 for the sake of integration stability.
Change 3457131 on 2017/05/24 by Arne.Schober
DR - [UE-45317] - Fix Depthbuffer not available for resolve in Forward mode
#jira UE-45317
#RB Chris.Bunner
Change 3457141 on 2017/05/24 by Marc.Olano
Sobol bug fixes
Change 3457953 on 2017/05/24 by Brian.Karis
Fix static analysis
#jira UE-45315
#jira UE-45314
#jira UE-45313
Change 3459064 on 2017/05/25 by Chris.Bunner
Fix for out of bounds material translation crash.
#jira UE-45406
Change 3459700 on 2017/05/25 by Brian.Karis
Revert using sprite index buffer because the vert order is different.
Change 3459847 on 2017/05/25 by Chris.Bunner
Fixing ensure in RenderTestMap.
[CL 3461201 by Chris Bunner in Main branch]
2017-05-26 08:22:50 -04:00
2020-07-06 18:58:26 -04:00
auto ComputeShader = View.ShaderMap->GetShader<FCullObjectsToClipmapCS>();
const FIntVector GroupSize = FComputeShaderUtils::GetGroupCount(DistanceFieldSceneData.NumObjectsInBuffer, FCullObjectsToClipmapCS::GetGroupSize());
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3461187)
#lockdown Nick.Penwarden
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3388286 on 2017/04/11 by Chris.Bunner
Fix mips in texture GnmUAV creation.
Change 3388287 on 2017/04/11 by Chris.Bunner
Improved PS/CS code sharing for TemporalAA.
Change 3388291 on 2017/04/11 by Chris.Bunner
HLODs now correctly hide their children in shadow maps.
Propagate bCastFarShadow flag on HLOD generation.
#jira UE-42254
Change 3388448 on 2017/04/11 by Brian.Karis
Better handle divide by zero
Change 3388449 on 2017/04/11 by Brian.Karis
Optimizations to shading model math.
PR #3340: Bug fixes related to shader TODOs (Contributed by vgfx)
Change 3388455 on 2017/04/11 by Uriel.Doyon
Changed Remove for RemoveSwap when clearing dynamic component references
Change 3388612 on 2017/04/11 by Simon.Tourangeau
Support shaders in projects and in plugins
When searching for a shader it will
- First look in Engine/Shaders as usual
- Then in project's Shader folder
- Then in all enabled plugin Shader folders
Project or plugin must be loaded in PostConfigInit phase
Tested in PIE, engine (cooked, packaged)
Change 3388819 on 2017/04/11 by Arne.Schober
DR - Faster MorphTarget implementation. Changed the previous gather approach to a scatter-based one. Reaching about 110GB/s on PS4 which is up to 4x faster than the previous implementation. On PC DX11 the impact is lower due to unnecessary UAV barriers but still 2x faster on AMD and up to 6x faster on Nvidia hardware.
#RB Lina.Halper, Rolando.Caloca
Change 3388862 on 2017/04/11 by Guillaume.Abadie
Allows Motion Blur and TAA in scene capture 2d.
Change 3388953 on 2017/04/11 by Uriel.Doyon
Fixed issue where lights from hidden levels were affecting the lighting build, by checking if the light is registered before adding it.
#UE-43220
Change 3389138 on 2017/04/11 by Arne.Schober
DR - Fix crash when opening a Level without Contentbrowser open.
#RB Matt.Kuhlenschmidt
Change 3389400 on 2017/04/11 by Uriel.Doyon
- Renamed FMaterialResource::IsSeparateTranslucencyEnabled() to FMaterialResource::IsTranslucencyAfterDOFEnabled()
- Removed different logic to determine if translucency after DOF was enabled, and centralized it into a single function: FSceneViewFamily::AllowTranslucencyAfterDOF()
- FSceneRenderTargets::FinishRenderingSeparateTranslucency() now only resolves a single view, allowing better Begin/Finish scopes.
- Renamed FSceneRenderTargets::SeparateTranslucencyDepthRT into FSceneRenderTargets::DownsampledTranslucencyDepthRT since this one is only allocated when rendering in downsampled mode.
- Standard translucency is now rendered in the same resolution as translucency after DOF. (downsampled or full resolution)
- Removed RenderTranslucencyParallel and merged its logic into RenderTranslucency. Renamed DrawAllTranslucencyPasses to RenderViewTranslucency and added a parallel version RenderViewTranslucencyParallel.
- Moved all debug draw logic (VisualizeLPV, ViewMeshElements and SimpleElementCollector) to a common place.
- New option "r.AllowDownsampledStandardTranslucency" to control the downsampling of standard translucency. Affect blend module materials
#jira UE-39505
Change 3389860 on 2017/04/12 by Richard.Wallis
UE-41407 Cable actor does not render correctly in viewport on Mac.
Build the mesh at creation time - call into existing mesh create function.
Change 3390933 on 2017/04/12 by Arne.Schober
DR - potential fix for UE-43125 where the this pointer might get invalidated in the middle of the function
#RB Marcus.Wassmer
Change 3391010 on 2017/04/12 by Ben.Marsh
Compile UE4Game non-unity for Mac as part of nightly builds in //UE4/Dev-Rendering.
Change 3391412 on 2017/04/12 by Uriel.Doyon
Mesh Decals are now sorted according to the component TranslucencySortPriority.
#jira UE-43053
Change 3392117 on 2017/04/13 by Guillaume.Abadie
Integrates Raven's experimental PCSS for cascaded shadow map hidden behind a CVar.
Change 3392179 on 2017/04/13 by Guillaume.Abadie
Attempts to fix linux compilation by removing mistakenly submitted dead code.
Change 3392231 on 2017/04/13 by Guillaume.Abadie
Fixes a wrong enum value real quick in FRenderingObjectVersion I introduced after main integration... Oups...
Change 3393879 on 2017/04/14 by Guillaume.Abadie
Attempts to fix linux compilation warning.
Change 3393881 on 2017/04/14 by Guillaume.Abadie
Back out changelist 3393879
Change 3393882 on 2017/04/14 by Guillaume.Abadie
Attempts #2 to fix linux compilation error.
Change 3394100 on 2017/04/14 by Chris.Bunner
Corrected material shared sampler usage with mip-biasing.
Change 3394174 on 2017/04/14 by Rolando.Caloca
DR - Change ensure to warning
Change 3394221 on 2017/04/14 by Marcus.Wassmer
Fix poseable mesh bounds calculation.
Change 3396238 on 2017/04/17 by David.Hill
Fix Bloom with LensFlare
Duplicating fix - will also fix directly in 4.16
#jira 44050
Change 3397055 on 2017/04/17 by Joe.Graf
Fixed Windows specific assumptions in Slate File Dialog Window's file filtering that lead to crashes
#CodeReview: matt.kuhlenschmidt
#rb: n/a
Change 3397921 on 2017/04/18 by Joe.Graf
Rewrote SlateFileDlgWindow's file filtering to allow for extensionless file selection and to remove the O(n^2) file filtering
#CodeReview: arciel.rekman, matt.kuhlenschmidt
#rb: n/a
Change 3398406 on 2017/04/18 by Rolando.Caloca
DR - Fix shaders in plugins on Mac
Change 3399546 on 2017/04/19 by Benjamin.Hyder
Updating content for test levels (HDR, Bloom_FFT, DistanceFields_IndirectShadows)
Change 3399725 on 2017/04/19 by Guillaume.Abadie
Avoids compiling PCSS shaders for SM4.
Change 3400295 on 2017/04/19 by Michael.Trepka
Fixed metal shader compile errors in MorphTargets.usf
Change 3400457 on 2017/04/19 by Michael.Trepka
Merged Rolando's shader fixes
Change 3400473 on 2017/04/19 by Arne.Schober
DR - provide Aftermath Reason when init failed.
#RB none
Change 3400699 on 2017/04/19 by Arne.Schober
DR - Fixed Text macro
#RB none
Change 3402280 on 2017/04/20 by Simon.Tovey
Minor cascade fix
#tests no crash
#jira UE-41560
Change 3402517 on 2017/04/20 by Arne.Schober
DR - Fix static analysis warning
#RB none
Change 3403897 on 2017/04/21 by Arne.Schober
DR - [UE-43898] - Someone missed a shaderversion bump which poisoned the DCC
#RB None
#jira UE-43898
Change 3404591 on 2017/04/21 by Olaf.Piesche
#jira UE-41979
Should never be crashing there, unless the mesh is changed after Init of the effect instance; this change safeguards against the number of mesh sections (and hence materials) changing after creation of the dynamic data to avoid the crash.
Change 3407451 on 2017/04/25 by Daniel.Wright
Fixed Indirect Lighting Cache updates caused by capsule indirect shadows forcing point samples, breaking primitives using ILCQ_Volume
Change 3407452 on 2017/04/25 by Daniel.Wright
Added r.AOJitterConeDirections, although disabled by default because it requires the temporal filter to be much stronger
Change 3408397 on 2017/04/25 by Daniel.Wright
ViewFamily.bRealtimeUpdate is set to false if Slate is throttling (like when toggling show flags). Volumetric fog discards the temporal history when not realtime, so you can see changes immediately.
Change 3408428 on 2017/04/25 by Daniel.Wright
Changed 'r.AOMaxObjectsPerCullTile' default back to 512 as 256 causes artifacts with RTDF shadows
Change 3409764 on 2017/04/26 by Daniel.Wright
Force dumping shader debug info for Global shaders when r.ShaderDevelopmentMode is enabled. Most of the shaders you want to look at in a GPU capture are global shaders, and global shaders create few debug files. 'recompileshaders global' time 35s -> 38s for SM5.
Change 3411659 on 2017/04/27 by Daniel.Wright
[Copy] Set Xbox One engine default screen percentage to 83.33 (1600x900), as ESRAM choices are dependent on this
Change 3411660 on 2017/04/27 by Daniel.Wright
[Copy] Global distance field composite shader has a version for each flattened axis, which improves efficiency when updating a slab which is what camera movement typically causes
Change 3411667 on 2017/04/27 by Daniel.Wright
[Copy] Discard distance field AO history buffer if it doesn't match the new buffer size. This prevents reading uninitialized data after a scene render target resize.
Change 3411668 on 2017/04/27 by Daniel.Wright
[Copy] Better indirect capsule shadow draw event info
Change 3411669 on 2017/04/27 by Daniel.Wright
[Copy] Pass down FeatureLevel to AddSubjectPrimitive and GatherShadowsForPrimitiveInner instead of calling the scene's virtual function. Showed up prominently in a sampling profile.
Change 3411755 on 2017/04/27 by Daniel.Wright
[Copy] Occlusion queries are now always done before the base pass if a nearly full prepass is being used (DDM_AllOccluders or DDM_AllOpaque)
* Removed r.OcclusionQueryLocation
Change 3411827 on 2017/04/27 by Daniel.Wright
[Copy] Much cheaper implementation of IsForwardShadingEnabled which showed up prominently in sampling profiles - inlined function and no more unnecessary thread safety overhead
Change 3411829 on 2017/04/27 by Daniel.Wright
Added an ensure to console manager when doing FindTConsoleVariableData* on a FAutoConsoleVariableRef
Change 3411837 on 2017/04/27 by Daniel.Wright
[Copy] Worked around slow memcpy's being used to sort FSortedLightSceneInfo
Change 3411838 on 2017/04/27 by Daniel.Wright
[Copy] Skip tracking MaterialRenderProxyMap on cooked platforms
Change 3411843 on 2017/04/27 by Daniel.Wright
[Copy] Fixed r.ParallelShadows on PS4 and enabled by default engine-wide (saves 5ms RT with CSM)
* Gnm was not tracking DepthClearValue when a depth target was set but not cleared
* Gnm has a bug where TargetsNeedingEliminateFastClear does not persist across commandlist breaks. Moved FinishRenderingGBuffer before RenderShadowDepthMaps to workaround (accidentally not in this changelist)
* Shadow depth rendering was not using BindClearMRTValues to populate GNM parallel commandlist TargetsNeedingEliminateFastClear values
Change 3411873 on 2017/04/27 by Daniel.Wright
[Copy] Deferred uniform expression caching. Setting multiple parameters on a material only causes its uniform expressions to be recached once.
* 280 calls to CacheUniformExpressions -> 120 during Fortnite combat (6.5ms -> 3.4ms)
Change 3411891 on 2017/04/27 by Daniel.Wright
[Copy] GatherShadowPrimitives optimizations
* Total GatherShadowPrimitives went from 2.3ms -> 1.3ms on PS4 with these changes in GPUPerfTest (duplicated 3x)
* Much flatter primitive octree (16 -> 256 max primitives)
* Primitives are culled against the shadow frustum before FPrimitiveSceneInfo or FPrimitiveSceneProxy are dereferenced in FilterPrimitiveForShadows
* FilterPrimitiveForShadows work is done in a ParallelFor. Primitive octree nodes are processed in different jobs.
* StaticMeshWholeSceneShadowBatchVisibility now only stores entries for meshes with bRequiresPerElementVisibility (landscape). Previously it was allocating and zeroing 500Kb 3x per frame (main view + 2 cascades) which cost ~.8ms on PS4.
Change 3412192 on 2017/04/27 by Michael.Trepka
Fixed Clang compile errors in FortniteGame, partial copy of CL 3313426
Change 3412547 on 2017/04/27 by Daniel.Wright
Fixed leak of FShadowMapAllocation and FLightMapAllocation's found by licensee
Change 3414239 on 2017/04/28 by Arne.Schober
DR - UE-44500 - Removed use of Structured Buffer from MorphTargets due to HLSLCC not supporting it.
#RB none
#jira UE-44500
Change 3414754 on 2017/04/28 by Daniel.Wright
Added VolumetricFogEmissive to ExponentialHeightFogComponent
* Volumetric fog does not yet support precomputed lighting, so this is the only way to get an ambient lighting term
Change 3416859 on 2017/05/01 by Arne.Schober
DR - Remove FeatureLevel from the Clear Functions to reduce area of error
#RB Rolando.Caloca
Change 3420750 on 2017/05/03 by Arne.Schober
DR - [UE-44497] - Fix several PS4 validation layer issues
#RB Marcus.Wassmer
Change 3422869 on 2017/05/04 by Benjamin.Hyder
Fix compile error from merge.
Change 3423938 on 2017/05/04 by Marc.Olano
[UE-44453] Fix bloom problems by moving saturate after vector math
Change 3424494 on 2017/05/04 by Olaf.Piesche
#jira UE-44589
When using FindTConsoleVariableData, the CVar cannot be an FAutoConsoleVariable.
#tests as described in jira ticket
Change 3424754 on 2017/05/04 by Uriel.Doyon
Fixed call to get texture compressor module outside the main thread.
#jira UE-42168
Change 3425447 on 2017/05/05 by Uriel.Doyon
#buildfix
Change 3427042 on 2017/05/05 by Arne.Schober
DR - Fix one of my typos
#RB none
Change 3428119 on 2017/05/08 by Marcus.Wassmer
Fix UE-44733
static analysis warning.
Change 3428222 on 2017/05/08 by Uriel.Doyon
Fixed bad condition in translucency rendering
#jira UE-44452
Change 3429794 on 2017/05/08 by Uriel.Doyon
Fixed issues with lightshafts and low res translucency.
#jira UE-44452
Change 3430921 on 2017/05/09 by Rolando.Caloca
DR - Get additional function pointers for D3DReflect, Compile and Disassemble instructions from the same DLL when compiling D3D11 shaders.
- Also fixes using the correct fxc.exe path to match the DLL we distribute.
Change 3431156 on 2017/05/09 by Rolando.Caloca
DR - Remove unused code
Change 3431396 on 2017/05/09 by David.Hill
Copy of changes made directly in 4.16 ( CL 341037 )
to be submitted to dev-rendering
#jira UE-44641
Change 3431400 on 2017/05/09 by Rolando.Caloca
DR - Fix typo
Change 3431527 on 2017/05/09 by David.Hill
#rb: none
Oops.
comment out r.ShaderDevelopmentMode =1
Change 3431590 on 2017/05/09 by Daniel.Wright
Removed early return landmine in USceneCaptureComponent2D::Serialize
Change 3431591 on 2017/05/09 by Daniel.Wright
Disallow map building while in PIE, or PIE while building lighting
Change 3431594 on 2017/05/09 by Daniel.Wright
Added RenderTargetFormat to UTextureRenderTarget2D, with choices of 8 bit, 16fp, 32fp and 1, 2 or 4 channels.
Change 3431667 on 2017/05/09 by Daniel.Wright
Volumetric fog now supersamples lighting when the history is not available, reducing noise on areas that just came on-screen or after a camera cut.
* The number of samples is controlled by r.VolumetricFog.HistoryMissSupersampleCount, defaults to 4, cinematic scalability uses 16
* Under fast camera movement, volumetric fog cost went from 1.79ms -> 1.97ms with 4 samples, on a 970GTX
Change 3432366 on 2017/05/10 by Richard.Wallis
Fix for MetalRHI Asserts When Using "Profile GPU" With RHI-Thread/Parallel-Execution. Don't insert events when not in RHIThread or the actual single-threaded-render thread.
#jira UE-36006
Change 3432367 on 2017/05/10 by Richard.Wallis
Fix for Metal ReStartRenderPass assert with profiling. macOS metal asserts when using "profileGPU" even with -norhithread argument set.
Added no action to the allowed render pass restart store actions for the depth buffer avoiding the assert. Interested to know the details if this is not a valid assumption to make - throwing away the depth buffer after a render pass I think would be a common case.
#jira UE-44322
Change 3432409 on 2017/05/10 by Richard.Wallis
Merged across CL 3415890 from Release-4.16 fix for (jira UE-43895)
Fix for deferred store actions getting cleared when we don't have a valid render target.
Change 3432833 on 2017/05/10 by Daniel.Wright
Fixed Ocean compile error
Change 3432874 on 2017/05/10 by Marc.Olano
Improved captions for Noise and VectorNoise material nodes
Change 3432947 on 2017/05/10 by Richard.Wallis
Fix for shared Material Native Shader Libraries Don't Function With Iterative Cooking. Keep latest versions of shader byte code in native shared material packaged build in an intermediate directory that can be reused on a later iterative cook.
- Doesn't handle deletion of the intermediate directory contents. Assumed to be a higher level requirement on non iterative cook flag.
#jira UE-44657
Change 3433484 on 2017/05/10 by Arne.Schober
DR - UE-44393 - Move ShaderPlatform into TShaderMap for extra debug information when it fails to find a proper shader. Also log when global shaders are verified and recompiled.
#jira UE-44393
#RB Daniel.Wright
Change 3433515 on 2017/05/10 by Arne.Schober
DR - Fix a bug where recompileshaders changed while compiling causes a crash where the cached local vertex factories are mutated while being used.
#RB Daniel.Wright
Change 3433606 on 2017/05/10 by Daniel.Wright
Fixed static shadowing of volumetric fog and translucency causing shadowing past the lightmass importance volume.
Change 3433619 on 2017/05/10 by Daniel.Wright
Skip recapturing reflection captures when PropagateLightingScenarioChange is being called for a level unload. This leaves stale results in reflection captures around when hiding a level in the editor, but avoids the double recapture that happens when swapping lighting scenarios in game, and the unnecessary reflection capture update when exiting PIE.
Change 3433795 on 2017/05/10 by Arne.Schober
DR - Add cmdline to select a GPU vendor when multiple GPUs from different vendors are installed in the same machine
#RB marcus.Wassmer
Change 3433941 on 2017/05/10 by Daniel.Wright
Cone vs tile bounding sphere intersection tests for Light Grid culling of spotlights, which provides much tighter culling than just View space tile AABB vs light bounding sphere.
* Forward shading BasePass 3.7ms -> 2.4ms in a scene with 24 spotlights on 970GTX
* Volumetric fog 2.87ms -> 2.09ms in the same scene
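Note on the cone vs sphere test above: a minimal standalone sketch of a conservative cone/bounding-sphere intersection, assuming a normalized cone axis and a half-angle below 90 degrees. `Vec3`, `SphereIntersectsCone` and all parameter names are hypothetical and are not taken from the engine's Light Grid code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float X, Y, Z; };

static float Dot(const Vec3& A, const Vec3& B) { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }
static Vec3 Sub(const Vec3& A, const Vec3& B) { return {A.X - B.X, A.Y - B.Y, A.Z - B.Z}; }

// Conservative test: returns false only when the sphere is certainly outside the cone.
// ConeDir must be normalized; HalfAngle is in radians, in (0, pi/2).
bool SphereIntersectsCone(const Vec3& SphereCenter, float SphereRadius,
                          const Vec3& ConeOrigin, const Vec3& ConeDir,
                          float HalfAngle, float ConeRange)
{
    assert(SphereRadius >= 0.0f && ConeRange >= 0.0f);

    const Vec3 V = Sub(SphereCenter, ConeOrigin);
    const float DistSq = Dot(V, V);
    const float AlongAxis = Dot(V, ConeDir);  // signed distance along the cone axis
    const float Perp = std::sqrt(std::max(0.0f, DistSq - AlongAxis * AlongAxis));

    // Signed distance from the sphere center to the cone's side surface.
    const float DistToSide = std::cos(HalfAngle) * Perp - AlongAxis * std::sin(HalfAngle);

    const bool bOutsideSide = DistToSide > SphereRadius;
    const bool bBeyondRange = AlongAxis > SphereRadius + ConeRange;
    const bool bBehindApex  = AlongAxis < -SphereRadius;
    return !(bOutsideSide || bBeyondRange || bBehindApex);
}
```

A tile would call this with its bounding sphere and the spotlight's position, direction, outer cone angle and attenuation radius; spotlights that fail the test are skipped for that tile.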
Change 3435139 on 2017/05/11 by Daniel.Wright
Restored GTextureRenderTarget2DMaxSizeX which is used by Ocean
Change 3435297 on 2017/05/11 by Arne.Schober
DR - Remove manual AlignOf and use C++11 keyword instead
#RB Steve.Robb
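Note on the AlignOf change above: a small sketch contrasting a common pre-C++11 probe-struct trick with the `alignof` keyword. The types and helper below are hypothetical and only illustrate the idea; the engine's original helper may have worked differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte aligned type, for illustration only.
struct alignas(16) FExampleVector4 { float X, Y, Z, W; };

// Pre-C++11 style: the offset of Value after a single char reveals its alignment.
struct FAlignProbe { char Pad; FExampleVector4 Value; };

// C++11 style: ask the compiler directly.
static_assert(alignof(FExampleVector4) == 16, "alignas(16) must be honoured");
static_assert(offsetof(FAlignProbe, Value) == alignof(FExampleVector4),
              "probe trick and alignof agree for this type");

// Runtime check that a pointer satisfies a power-of-two alignment.
bool IsAligned(const void* Ptr, std::size_t Alignment)
{
    assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(Ptr) & (Alignment - 1)) == 0;
}
```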
Change 3435367 on 2017/05/11 by Daniel.Wright
Circle vertex buffer for slightly tighter voxelization of volumetric fog shadowed lights
* 1.5ms -> 1.38ms on 970 GTX with 24 spotlights
Change 3435522 on 2017/05/11 by Brian.Karis
Dither opacity mask now stacks properly for non parallel polys. Dither is randomized by triangle normal.
Change 3436063 on 2017/05/11 by Daniel.Wright
Disabled CLB_AggressiveBatching for PC d3d12 as it causes flickering artifacts in lighting
Change 3436269 on 2017/05/11 by Uriel.Doyon
Fixed UVChannel data possibly not up-to-date depending on user manips.
Change 3436611 on 2017/05/12 by Simon.Tovey
Improved name and tooltip for static mesh property controlling generation of alias tables for uniform sampling.
Change 3436676 on 2017/05/12 by Simon.Tovey
Fix for fixed bounds being "invalid" unless set via the toolbar option.
Change 3436700 on 2017/05/12 by Simon.Tovey
Crash fix.
Issue found in https://udn.unrealengine.com/questions/355944/crash-in-fdynamicspriteemitterdatagetdynamicmeshel.html
Particle proxies would have stale material resource pointers if the material is changed while the system was invisible.
If the old material is freed during this time, the next time the system renders it will crash.
Change 3437367 on 2017/05/12 by Brian.Karis
Fixed bug with small UV charts not packing.
Change 3437860 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI alignment is 4 for int64
#RB none
Change 3437972 on 2017/05/12 by Arne.Schober
DR - Fix alignment compile error in win32 where according to ABI function calls cannot take aligned structures. In all of the cases the copy was completely unnecessary.
#RB none
Change 3437975 on 2017/05/12 by Chris.Bunner
Added calculation for MaterialParamsEx to MeshDecals.usf.
#jira UE-43052
Change 3438109 on 2017/05/12 by Rolando.Caloca
DR - Support for -nomcpp on SCW
Change 3438889 on 2017/05/15 by Chris.Bunner
Nullptr check in a few material uniform expressions.
Change 3439351 on 2017/05/15 by Chris.Bunner
Added tooltip to Power material expression.
Change 3439763 on 2017/05/15 by Daniel.Wright
Apply passed in DistanceBiasSqr to line lights - allows volumetric fog to reduce aliasing on line lights
Change 3439764 on 2017/05/15 by Daniel.Wright
Fixed order of operations with bTreatMaxDepthUnshadowed - manifested as unfiltered static shadow depth lookups
Change 3440722 on 2017/05/16 by Guillaume.Abadie
Exposes Scene capture's FOV to blueprints
Change 3441680 on 2017/05/16 by Uriel.Doyon
Added units to point light intensity, to allow the user to specify the value in candelas or lumens.
New point light actors now configure the intensity in candelas by default.
Replaced viewport exposure settings by an EV100 slider.
Hiding the tone mapper in the show flag now still applies the exposure.
Added a new AutoExposure method called EV100 which allows specifying:
- MinEV100, MaxEV100
- Calibration Constant
- Exposure Compensation
#jira UE-42783
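Note on the light units above: a minimal sketch of the candela/lumen relationship for an idealized point light that emits equally in all directions (luminous flux = intensity times the 4*pi steradians of a full sphere). The function names are hypothetical and this is not the engine's conversion code, which may apply additional unit scaling.

```cpp
#include <cassert>
#include <cmath>

constexpr double Pi = 3.14159265358979323846;

// Isotropic point source: lumens = candelas * 4 * pi.
double CandelasToLumens(double Candelas)
{
    assert(Candelas >= 0.0);
    return Candelas * 4.0 * Pi;
}

// Inverse of the conversion above.
double LumensToCandelas(double Lumens)
{
    assert(Lumens >= 0.0);
    return Lumens / (4.0 * Pi);
}
```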
Change 3441884 on 2017/05/16 by Uriel.Doyon
Fixed StreamingDistanceMultiplier not being applied to the texture streaming data.
Change 3442800 on 2017/05/17 by Gil.Gribb
Fixed botched merge.
Change 3442896 on 2017/05/17 by Gil.Gribb
UE4 - Allowed the possibility of running the RHI "thread" on task threads instead and cleaned up and unified the conditionals involved. By default we still have a dedicated RHI thread because it tested slightly faster.
Change 3443951 on 2017/05/17 by Richard.Wallis
Added Apple override allocator macro - each command encoder type needs its own allocator queue.
Change 3444787 on 2017/05/17 by Daniel.Wright
Fixed DBuffer decal default normal (used when DBuffer decals enabled, but no decals rendered) not reconstructing zero properly, adding -.008 to WorldNormal which then caused artifacts with forward lighting specular on materials with roughness near 0.
Change 3444882 on 2017/05/17 by Daniel.Wright
Added comment to FClearValueBinding::DefaultNormal8Bit to make the dependency on shader decode clear
Change 3444883 on 2017/05/17 by Brian.Karis
Improved contact shadows
Change 3445048 on 2017/05/17 by Daniel.Wright
Fixed particle lights in forward shading, they were not setting the lighting channel mask properly
Change 3445107 on 2017/05/17 by Michael.Trepka
Changed the order of operations in FMetalStateCache::SetRenderState to work around an issue with some Intel drivers where they would not recalculate the raster state in some edge cases.
#jira UE-43725
Change 3445212 on 2017/05/17 by Uriel.Doyon
Added a -CSV option to ListTextures command
Change 3445947 on 2017/05/18 by Richard.Wallis
Clone of Release-4.16 Stream CL 3437181 and CL 3442450 - fix(s) for black rendering on macOS El Cap with Nvidia GPU. Move sampling of EyeAdaption texture to pixel shader for Mac Metal using shader language version <= 1 only.
Change 3446545 on 2017/05/18 by Chris.Bunner
Removed hardcoded (and unused) MRT write from Decal shaders.
#jira UE-45095
Change 3446568 on 2017/05/18 by Marc.Olano
Sobol and image-based importance sampling C++ functions and blueprint nodes
Change 3446988 on 2017/05/18 by Marc.Olano
Fix build error: missing include
Change 3446990 on 2017/05/18 by Marc.Olano
Cell-indexed Sobol sampling for shaders (in MonteCarlo.usf) and materials (Sobol and TemporalSobol nodes)
Change 3447142 on 2017/05/18 by Rolando.Caloca
DR - RWLock instead of mutex for PSO cache
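Note on the RWLock change above: a minimal sketch of why a reader/writer lock suits a cache that is read far more often than it is written, using the standard `std::shared_mutex` (C++17). `FExamplePSOCache` and `FExamplePipelineState` are hypothetical stand-ins, not the engine's PSO cache types.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical stand-in for a compiled pipeline state object.
struct FExamplePipelineState { std::uint64_t Key; };

class FExamplePSOCache
{
public:
    // Lookups take a shared lock, so many threads can read concurrently.
    const FExamplePipelineState* Find(std::uint64_t Key) const
    {
        std::shared_lock<std::shared_mutex> ReadLock(Mutex);
        const auto It = Map.find(Key);
        return It != Map.end() ? &It->second : nullptr;
    }

    // Insertion takes the exclusive lock. emplace() is a no-op if another
    // thread added the same key between our lookup and acquiring the lock.
    const FExamplePipelineState& FindOrAdd(std::uint64_t Key)
    {
        if (const FExamplePipelineState* Existing = Find(Key))
        {
            return *Existing;
        }
        std::unique_lock<std::shared_mutex> WriteLock(Mutex);
        const FExamplePipelineState& Result = Map.emplace(Key, FExamplePipelineState{Key}).first->second;
        assert(Result.Key == Key);
        return Result;
    }

private:
    mutable std::shared_mutex Mutex;
    std::unordered_map<std::uint64_t, FExamplePipelineState> Map;
};
```

With a plain mutex every lookup serializes; with the shared lock only cache misses contend for exclusive access.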
Change 3447144 on 2017/05/18 by Uriel.Doyon
Moved shading model code to SetGBufferFromShadingModel(). This allows the code to be reused in other shader files.
Change 3447794 on 2017/05/18 by Brian.Karis
Virtual texturing foundation code
Change 3448944 on 2017/05/19 by Arciel.Rekman
Fix non-unity Linux (and Mac, etc) builds.
- Mac fix is tentative, did not try.
Change 3449183 on 2017/05/19 by Marcus.Wassmer
Duplicate fix for reflection captures to happen after sequencer updates.
Change 3449196 on 2017/05/19 by Uriel.Doyon
Handling RCM_MinMax when reading FloatRGBA textures.
This fixes pixel inspector always reading 1 for scene color values greater than one.
Change 3451652 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45245
Change 3451660 on 2017/05/22 by Chris.Bunner
Additional compile fix.
#jira UE-45245
Change 3451897 on 2017/05/22 by Daniel.Wright
Moved RTDF shadow project back after the base pass, since it samples the GBuffer for subsurface shadowing. Removed r.DFShadowAsyncCompute which was relying on the previous ordering.
Change 3452055 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
#jira UE-45265
Change 3452089 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452108 on 2017/05/22 by Rolando.Caloca
DR - Compile fix
#jira UE-45246
Change 3452179 on 2017/05/22 by Brian.Karis
Exposed dimensions. Fixed static analysis.
Change 3452734 on 2017/05/22 by Daniel.Wright
When post processing is disabled, TPT_TranslucencyAfterDOF translucency gets forced into the standard translucency pass.
Change 3452770 on 2017/05/22 by Daniel.Wright
Static light source shapes drawn into reflection captures handle SourceLength via scaled sphere
Change 3452861 on 2017/05/22 by Rolando.Caloca
DR - Switch compile fix
Change 3452952 on 2017/05/22 by Brian.Karis
Small VT fixes
Change 3453647 on 2017/05/23 by Richard.Wallis
Fix for tessellation shaders on Mac (Metal v1.2) failing to compile.
#jira UE-45227
Change 3454844 on 2017/05/23 by Uriel.Doyon
Fixed extra X16 on some point lights
#jira UE-45250
Change 3454934 on 2017/05/23 by Chris.Bunner
Backing out changelists 3441680, 3454636 and 3454844 for the sake of integration stability.
Change 3457131 on 2017/05/24 by Arne.Schober
DR - [UE-45317] - Fix Depthbuffer not available for resolve in Forward mode
#jira UE-45317
#RB Chris.Bunner
Change 3457141 on 2017/05/24 by Marc.Olano
Sobol bug fixes
Change 3457953 on 2017/05/24 by Brian.Karis
Fix static analysis
#jira UE-45315
#jira UE-45314
#jira UE-45313
Change 3459064 on 2017/05/25 by Chris.Bunner
Fix for out of bounds material translation crash.
#jira UE-45406
Change 3459700 on 2017/05/25 by Brian.Karis
Revert using sprite index buffer because the vert order is different.
Change 3459847 on 2017/05/25 by Chris.Bunner
Fixing ensure in RenderTestMap.
[CL 3461201 by Chris Bunner in Main branch]
2017-05-26 08:22:50 -04:00
2020-07-06 18:58:26 -04:00
FComputeShaderUtils::AddPass(
	GraphBuilder,
	RDG_EVENT_NAME("CullToClipmap"),
	ComputeShader,
	PassParameters,
	GroupSize);
}
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3185985)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3170391 on 2016/10/21 by Ben.Woodhouse
Remove the wait on end of frame ensure, because we can't rely on all the underlying codepaths to never miss a call to flush RHI resources. The consequences of missing a flush on a given frame are not serious now, since we enforce the synchronisation with a fence, preventing the rendering thread from getting too far ahead. We will simply accumulate resources for an additional frame when this happens.
#jira UE-37437
#fyi rolando.caloca, marcus.wassmer
Change 3170659 on 2016/10/21 by Rolando.Caloca
DR - vk - Prep work for state key changes
Change 3170676 on 2016/10/21 by Rolando.Caloca
DR - vk - Reworked blend state keys
- Added depth/stencil to pipeline key
Change 3170848 on 2016/10/21 by Daniel.Wright
Level viewport 'show stats' option is now enabled by default, which avoids confusion with artists thinking lighting is built, when really the message is hidden.
Change 3170849 on 2016/10/21 by Daniel.Wright
Split FProjectedShadowInfo::RenderProjection into smaller functions which make the algorithm structure clear
Change 3170995 on 2016/10/21 by Rolando.Caloca
DR - vk - Show object on vulkan validation msgs
Change 3171085 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix pipelines being used with incompatible renderpasses
Change 3171159 on 2016/10/21 by Rolando.Caloca
DR - vk - Fix layout when reading textures on CPU
Change 3171167 on 2016/10/21 by Rolando.Caloca
DR - vk - compile fix
Change 3172462 on 2016/10/24 by Daniel.Wright
Added a warning about shader compile times to the material tooltip
Change 3172463 on 2016/10/24 by Daniel.Wright
Reduced MinUnoccludedFraction to avoid artifacts when a stationary light touches only a tiny part of a mesh
Change 3172716 on 2016/10/24 by Brian.Karis
Fix for crash UE-37369 when reimporting over a generated LOD.
Change 3172967 on 2016/10/24 by Rolando.Caloca
DR - vk - Fix writing buffers while GPU was using them
Change 3174187 on 2016/10/25 by Olaf.Piesche
UE-37020
Change 3174718 on 2016/10/26 by Rolando.Caloca
DR - vk - Remove old timestamp queries, increase occlusion queries per pool to 4k
Change 3175960 on 2016/10/26 by Rolando.Caloca
DR - Added support for hlslcc header to have custom parsing
Change 3176611 on 2016/10/27 by David.Hill
DrawWireCone confusion:
In response to a UDN, I'm updating confusing parameter names and comments for
DrawWireCone() and DrawWireSphereCappedCone()
Change 3177111 on 2016/10/27 by Rolando.Caloca
DR - vk - Fix timestamps for frame
Change 3177192 on 2016/10/27 by Arne.Schober
DR - DitherLOD refactor - moved computation of the DepthStencil state out of SetMeshRenderState into GetDitheredLODTransitionState. This is a prerequisite of further PSO work where we want to move up State setting in a similar way and reuse FMeshDrawingRenderState
Change 3177278 on 2016/10/27 by Olaf.Piesche
UE-37484
Change 3177297 on 2016/10/27 by Rolando.Caloca
DR - vk - Enable GRHISupportsBaseVertexIndex
Change 3177607 on 2016/10/27 by Rolando.Caloca
DR - vk - SM4 UB prep
Change 3178052 on 2016/10/28 by Arne.Schober
DR - fix WebGL - the WebGL compiler is very picky about double underscores and wants the precision to be defined before any function definition.
Change 3178156 on 2016/10/28 by Rolando.Caloca
DR - vk - Added query timer
- Fixed inline issues
Change 3178158 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for out of stencil bits
Change 3178462 on 2016/10/28 by Rolando.Caloca
DR - vk - Fixes for Elemental
Change 3179131 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix for r.Vulkan.UseRealUBs
Change 3179139 on 2016/10/28 by Rolando.Caloca
DR - vk - Move UB ring buffer to context
Change 3179145 on 2016/10/28 by Rolando.Caloca
DR - vk - Fix buffer barriers
Change 3179888 on 2016/10/31 by Rolando.Caloca
DR - vk - Align buffers to 16 bytes as we sometimes write to them with SIMD
Change 3179923 on 2016/10/31 by Rolando.Caloca
DR - vk - Wait for swapchain counter
Change 3180430 on 2016/10/31 by Rolando.Caloca
DR - vk - Properly wait for occlusion queries/cmd buffer
- Actual log error if trying to use occlusion queries out of order
Change 3180746 on 2016/10/31 by Rolando.Caloca
DR - vk - Undo some waiting as it was on the wrong thread
Change 3182115 on 2016/11/01 by Rolando.Caloca
DR - hlslcc Linux path fix
Change 3182118 on 2016/11/01 by Daniel.Wright
Fixed global distance field seam artifacts from landscapes with no subsections
Change 3182368 on 2016/11/01 by Daniel.Wright
Dynamic Indirect Shadows for static meshes using distance fields
* These Distance Field indirect shadows use the same tile culled and downsampled framework that capsule shadows use, with similar GPU cost
* Individual StaticMesh assets can enable bGenerateMeshDistanceField to compute a distance field, without the memory cost of enabling for the whole project
* New StaticMeshComponent properties bCastDynamicIndirectShadow and DynamicIndirectShadowMinVisibility
* New WorldSettings property DynamicIndirectShadowsSelfShadowingIntensity which replaces the cvar
* The GBuffer now stores HasDynamicIndirectShadowCasterRepresentation instead of HasHeightfieldRepresentation
* DFAO from landscape is now done through the global distance field entirely. Landscape contribution to the global distance field is deferred to attempt to workaround texture streaming issues.
Change 3182408 on 2016/11/01 by Rolando.Caloca
DR - vk - Reworked occlusion queries, fixes flickering on AMD
Change 3182585 on 2016/11/01 by Daniel.Wright
PS4 compile fix
Change 3183151 on 2016/11/02 by Rolando.Caloca
DR - vk - Fix issue when processing super quick cmd buffers
Change 3183160 on 2016/11/02 by Rolando.Caloca
DR - vk - Call reset queries outside render pass
Change 3183182 on 2016/11/02 by Rolando.Caloca
DR - Switch clear
Change 3183194 on 2016/11/02 by Rolando.Caloca
DR - Try to catch crash ahead of time
Change 3183268 on 2016/11/02 by Rolando.Caloca
DR - vk - Rename RenderPassState to TransitionState
Change 3183440 on 2016/11/02 by Daniel.Wright
Renamed 'Dynamic Indirect Shadow' to 'Distance Field Indirect Shadow'
Change 3183793 on 2016/11/02 by Daniel.Wright
Added ShadowResolutionScale to lightcomponent
Change 3183796 on 2016/11/02 by Daniel.Wright
Improved bSimulatePhysics comment, with info on why it might be greyed out
Change 3183797 on 2016/11/02 by Daniel.Wright
Precomputed shadowmaps no longer enable Force2To1Aspect, which is only needed for lightmaps. Improves shadowmap utilization.
Change 3183915 on 2016/11/02 by Rolando.Caloca
DR - vk - Remove redundant renderpasses
Change 3183991 on 2016/11/02 by Daniel.Wright
Added r.ReflectionEnvironmentLightmapMixLargestWeight, useful for restricting lightmap mixing to darkening only
Change 3184001 on 2016/11/02 by Daniel.Wright
Better draw event for IndirectCapsuleShadows in stereo
Change 3184096 on 2016/11/02 by Chris.Bunner
HDR for D3D11 - NVAPI toggle and encoding, UI compositing.
Removed some outdated tonemapping cvars and modes.
Change 3184399 on 2016/11/02 by Daniel.Wright
Static analysis workaround
Change 3184455 on 2016/11/02 by Mark.Satterthwaite
Fix missing log10 from FCompositePS on hlslcc shader platforms so that QA can continue their integration.
#jira UE-38164
Change 3184953 on 2016/11/03 by Chris.Bunner
Fixing CIS warnings.
[CL 3186011 by Marcus Wassmer in Main branch]
2016-11-03 16:55:27 -04:00
2022-01-26 17:07:27 -05:00
const uint32 GGlobalDistanceFieldMaxPageNum = GlobalDistanceField::GetMaxPageNum(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304)
#lockdown Nick.Penwarden
#rb none
==========================
MAJOR FEATURES + CHANGES
==========================
Change 3250856 on 2017/01/09 by Daniel.Wright
Only showing instruction count for 'Base pass shader' now
Change 3250943 on 2017/01/09 by Rolando.Caloca
DR - Async Compute PSO creation
Change 3251036 on 2017/01/09 by Rolando.Caloca
DR - Add r.AsyncPipelineCompile
- Dispatch on any thread
- Wait for completion event
Change 3251058 on 2017/01/09 by Ben.Woodhouse
Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN)
#jira UE-40332
Change 3251141 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite CL 3243458:
D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller. In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of wasted memory.
On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage.
Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX
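Note on the buddy allocator waste above: a minimal sketch of internal fragmentation in a generic buddy allocator, assuming each request is rounded up to the next power-of-two block. This is a textbook model with hypothetical function names; the D3D12 RHI allocator may differ in details such as minimum block size, alignment, or the actual source of the waste measured in the changelist.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= Size. Size must be in [1, 2^63].
std::uint64_t RoundUpToPowerOfTwo(std::uint64_t Size)
{
    assert(Size > 0 && Size <= (std::uint64_t{1} << 63));
    std::uint64_t Block = 1;
    while (Block < Size)
    {
        Block <<= 1;
    }
    return Block;
}

// Bytes lost to internal fragmentation when a buddy allocator serves Size bytes.
std::uint64_t BuddyWaste(std::uint64_t Size)
{
    return RoundUpToPowerOfTwo(Size) - Size;
}
```

Under this model a 300KB request occupies a 512KB block, so routing large requests away from the buddy allocator (by lowering its maximum allocation size) avoids that rounding loss.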
Change 3251142 on 2017/01/09 by Ben.Woodhouse
Duplicated from Fortnite 3243496
memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time.
Change 3252323 on 2017/01/10 by Rolando.Caloca
DR - Gfx async PSO creation prep
Change 3252474 on 2017/01/10 by Daniel.Wright
Added 'Compile Unreal Lightmass' to error message
Change 3252589 on 2017/01/10 by Daniel.Wright
Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite
Change 3252790 on 2017/01/10 by Daniel.Wright
Added InscatteringColorCubemapAngle to exponential height fog
Change 3252843 on 2017/01/10 by Uriel.Doyon
Proper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways.
The bound defrag is now done outside of the async work logic.
Change 3252866 on 2017/01/10 by Mark.Satterthwaite
Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future.
#jira UE-40357
Change 3254511 on 2017/01/11 by Rolando.Caloca
DR - PSO stats
Change 3255958 on 2017/01/12 by Mark.Satterthwaite
Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed.
#jira UE-40554
Change 3256329 on 2017/01/12 by Olaf.Piesche
#jira UE-38615
Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime.
Change 3256371 on 2017/01/12 by Uriel.Doyon
Reenabled texture streaming bound defrag as the fix is in CL 3252843
Change 3257032 on 2017/01/13 by Daniel.Wright
Added fastClamp to fastmath.usf
Change 3257111 on 2017/01/13 by Daniel.Wright
Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game
Change 3257112 on 2017/01/13 by Daniel.Wright
DFAO optimizations
* Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms)
* Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms)
* Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms
* Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms)
Change 3257113 on 2017/01/13 by Daniel.Wright
Better distance field memory stats
Change 3257326 on 2017/01/13 by Uriel.Doyon
Workaround to support cases where several textures have the same lighting GUID.
Change 3257448 on 2017/01/13 by Daniel.Wright
Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles
Change 3257616 on 2017/01/13 by Daniel.Wright
Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable
Change 3257657 on 2017/01/13 by Daniel.Wright
Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU
* 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms
Change 3258063 on 2017/01/14 by Rolando.Caloca
DR - vk - Refactor descriptor set reuse in prep for more changes
Change 3258715 on 2017/01/16 by Daniel.Wright
Added VisualizeGlobalDistanceField show flag
Change 3258827 on 2017/01/16 by Daniel.Wright
Global distance field update regions are clipped against others to reduce redundant updates.
Change 3258959 on 2017/01/16 by Benjamin.Hyder
Updating Planar Reflection example material in TM-Shadermodels
Change 3259270 on 2017/01/16 by Daniel.Wright
[Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons.
Change 3259652 on 2017/01/16 by Uriel.Doyon
Better support for static primitive becoming dynamic.
Change 3260107 on 2017/01/17 by Ben.Woodhouse
Fix FMonitoredProcess to prevent infinite loop in -nothreading mode
#jira UE-40717
Change 3260594 on 2017/01/17 by Daniel.Wright
Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary)
* The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field
* Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same.
* Adds 12Mb for the new volume textures
Change 3260956 on 2017/01/17 by Daniel.Wright
Structured buffers for DF object data
* Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads
Change 3261296 on 2017/01/17 by Daniel.Wright
Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures
Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms
Change 3262036 on 2017/01/18 by Ben.Salem
V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details.
Change 3262056 on 2017/01/18 by Chris.Bunner
Remove inverse tonemapping when rendering HDR output.
#jira UE-40728
Change 3262661 on 2017/01/18 by Rolando.Caloca
DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs
- Fix hash for PSOs
Change 3263674 on 2017/01/19 by Chris.Bunner
PR #3144: Improved error messages (Contributed by DarkSlot)
#jira UE-40835
Change 3264150 on 2017/01/19 by Ben.Woodhouse
Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode
#jira UE-40841
Change 3264153 on 2017/01/19 by Ben.Woodhouse
Integrate latest changes from MS-DX12 CLs 3231395-3262526
- Added WinPixEventRuntime.tps
- Includes PIX support, various optimizations (saved 1.3ms in testbed scene)
CL 3262343:
Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes:
- Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits weren't intended for SRV formats and always returned false in the SRV case.
CL 3231395:
Update D3D12 RHI:
- Fix deferred MSAA path in RHI
- Add Pix3.h support
- Cleanup SetName usage and remove it from shipping builds.
- Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value.
- Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64
- Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version.
- Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident.
Change 3264251 on 2017/01/19 by Mark.Satterthwaite
Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions.
#jira UE-40803
Change 3264642 on 2017/01/19 by Daniel.Wright
Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096.
Change 3265330 on 2017/01/20 by Ben.Salem
Stop performance plugin from building in Win32.
#tests recompiled and preflighted
Change 3265678 on 2017/01/20 by Marcus.Wassmer
Fix bad declaration.
#3055
Change 3266656 on 2017/01/20 by Mark.Satterthwaite
Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging).
Duplicate & amend CL #3266053 from Trepka:
Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled.
Amend & integrate RCO's CL #3197085.
Change 3267741 on 2017/01/23 by Rolando.Caloca
DR - Detect duplicated shader and pipeline types
Change 3268600 on 2017/01/23 by Uriel.Doyon
Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scalability settings.
Integrated CL 3227368 from Orion stream
Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months.
Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing.
Added the MaxEffectiveScreenSize settings in the investigate texture command.
Change 3269512 on 2017/01/24 by Richard.Wallis
Fix for shader binary cache uncompress data size during internal shader log.
Change 3271237 on 2017/01/25 by Ben.Woodhouse
D3D12 updateTexture2D crash fix
#jira UE-41059
Change 3271564 on 2017/01/25 by Olaf.Piesche
#jira UE-40980
#udn 325525
Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe
Change 3271594 on 2017/01/25 by Ben.Woodhouse
ESRAM support stage 1:
Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range.
Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed
Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage
Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR
#fyi rolando.caloca
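A minimal sketch of the page-reuse idea in the ESRAM note above, assuming a simple free list of fixed-size physical pages. `PagePool` and all of its members are hypothetical names for illustration; this is not the engine's allocator and it does not touch any platform memory API. Entry i of the returned mapping is the physical page backing virtual page i, so the virtual range stays contiguous even when the physical pages are not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical illustration only; not the engine's ESRAM allocator.
class PagePool
{
public:
    explicit PagePool(uint32_t NumPages)
    {
        // Push in reverse so that page 0 is handed out first.
        for (uint32_t Index = NumPages; Index > 0; --Index)
        {
            FreePages.push_back(Index - 1);
        }
    }

    // Returns one physical page per virtual page, or nullopt if the pool
    // cannot satisfy the request. The physical pages need not be adjacent.
    std::optional<std::vector<uint32_t>> Allocate(uint32_t Count)
    {
        if (Count > FreePages.size())
        {
            return std::nullopt;
        }
        std::vector<uint32_t> Mapping;
        Mapping.reserve(Count);
        for (uint32_t Index = 0; Index < Count; ++Index)
        {
            Mapping.push_back(FreePages.back());
            FreePages.pop_back();
        }
        return Mapping;
    }

    // Returns the pages of a destroyed resource to the pool for reuse.
    void Free(const std::vector<uint32_t>& Mapping)
    {
        FreePages.insert(FreePages.end(), Mapping.begin(), Mapping.end());
    }

    std::size_t NumFree() const { return FreePages.size(); }

private:
    std::vector<uint32_t> FreePages;
};
```

In this sketch, freeing an early allocation and then allocating again hands back a mix of old and never-used pages, which is the non-contiguous case the note describes.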
Change 3272616 on 2017/01/25 by Rolando.Caloca
DR - Update shader version
Change 3273138 on 2017/01/26 by Ben.Woodhouse
Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main)
[CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
2020-09-08 17:44:06 -04:00
const uint32 PageGridDim = FMath : : DivideAndRoundUp ( ClipmapResolution , GGlobalDistanceFieldPageResolution ) ;
const uint32 PageGridSize = PageGridDim * PageGridDim * PageGridDim ;
const FIntVector PageGridResolution ( PageGridDim , PageGridDim , PageGridDim ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
const FVector PageTileWorldExtent = ClipmapVoxelExtent * GGlobalDistanceFieldPageResolutionInAtlas ;
const FVector PageTileWorldExtentWithoutBorders = ClipmapVoxelExtent * GGlobalDistanceFieldPageResolution ;
const FVector PageGridCoordToWorldCenterScale = ClipmapSize / FVector ( PageGridResolution ) ;
const FVector PageGridCoordToWorldCenterBias = Clipmap . Bounds . Min + 0.5f * PageGridCoordToWorldCenterScale ;
2020-07-06 18:58:26 -04:00
2022-04-22 19:55:41 -04:00
const FIntVector CullGridResolution = FComputeShaderUtils : : GetGroupCount ( PageGridResolution , GlobalDistanceField : : CullGridFactor ) ;
const int32 CullGridSize = CullGridResolution . X * CullGridResolution . Y * CullGridResolution . Z ;
const FVector CullTileWorldExtent = ClipmapVoxelExtent * GGlobalDistanceFieldPageResolutionInAtlas * GlobalDistanceField : : CullGridFactor ;
const FVector CullGridCoordToWorldCenterScale = ClipmapSize / FVector ( CullGridResolution ) ;
const FVector CullGridCoordToWorldCenterBias = Clipmap . Bounds . Min + 0.5 * CullGridCoordToWorldCenterScale ;
2020-09-08 17:44:06 -04:00
FRDGBufferRef PageUpdateTileBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , PageGridSize ) , TEXT ( " PageUpdateTiles " ) ) ;
FRDGBufferRef PageComposeTileBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , PageGridSize ) , TEXT ( " PageComposeTiles " ) ) ;
FRDGBufferRef PageComposeHeightfieldTileBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , PageGridSize ) , TEXT ( " PageComposeHeightfieldTiles " ) ) ;
2022-04-22 19:55:41 -04:00
FRDGBufferRef CullGridUpdateTileBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , CullGridSize ) , TEXT ( " CullGridUpdateTiles " ) ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
FRDGBufferRef PageUpdateIndirectArgBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateIndirectDesc < FRHIDispatchIndirectParameters > ( 1 ) , TEXT ( " PageUpdateIndirectArgs " ) ) ;
2022-04-22 19:55:41 -04:00
FRDGBufferRef CullGridUpdateIndirectArgBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateIndirectDesc < FRHIDispatchIndirectParameters > ( 1 ) , TEXT ( " CullGridUpdateIndirectArgs " ) ) ;
2020-09-08 17:44:06 -04:00
FRDGBufferRef PageComposeIndirectArgBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateIndirectDesc < FRHIDispatchIndirectParameters > ( 1 ) , TEXT ( " PageComposeIndirectArgs " ) ) ;
FRDGBufferRef PageComposeHeightfieldIndirectArgBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateIndirectDesc < FRHIDispatchIndirectParameters > ( 1 ) , TEXT ( " PageComposeHeightfieldIndirectArgs " ) ) ;
2020-07-06 18:58:26 -04:00
// Clear indirect dispatch arguments
{
FClearIndirectArgBufferCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FClearIndirectArgBufferCS : : FParameters > ( ) ;
2020-09-08 17:44:06 -04:00
PassParameters - > RWPageUpdateIndirectArgBuffer = GraphBuilder . CreateUAV ( PageUpdateIndirectArgBuffer , PF_R32_UINT ) ;
2022-04-22 19:55:41 -04:00
PassParameters - > RWCullGridUpdateIndirectArgBuffer = GraphBuilder . CreateUAV ( CullGridUpdateIndirectArgBuffer , PF_R32_UINT ) ;
2020-09-08 17:44:06 -04:00
PassParameters - > RWPageComposeIndirectArgBuffer = GraphBuilder . CreateUAV ( PageComposeIndirectArgBuffer , PF_R32_UINT ) ;
2020-07-06 18:58:26 -04:00
auto ComputeShader = View . ShaderMap - > GetShader < FClearIndirectArgBufferCS > ( ) ;
FComputeShaderUtils : : AddPass (
GraphBuilder ,
RDG_EVENT_NAME ( " ClearIndirectArgBuffer " ) ,
ComputeShader ,
PassParameters ,
FIntVector ( 1 , 1 , 1 ) ) ;
}
2020-09-08 17:44:06 -04:00
// Prepare page tiles which need to be updated for update regions
2020-07-06 18:58:26 -04:00
{
FBuildGridTilesCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FBuildGridTilesCS : : FParameters > ( ) ;
2020-09-08 17:44:06 -04:00
PassParameters - > View = View . ViewUniformBuffer ;
2022-04-22 19:55:41 -04:00
PassParameters - > RWPageTileBuffer = GraphBuilder . CreateUAV ( PageUpdateTileBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageIndirectArgBuffer = GraphBuilder . CreateUAV ( PageUpdateIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > RWCullGridTileBuffer = GraphBuilder . CreateUAV ( CullGridUpdateTileBuffer , PF_R32_UINT ) ;
PassParameters - > RWCullGridIndirectArgBuffer = GraphBuilder . CreateUAV ( CullGridUpdateIndirectArgBuffer , PF_R32_UINT ) ;
2020-07-06 18:58:26 -04:00
PassParameters - > UpdateBoundsBuffer = GraphBuilder . CreateSRV ( UpdateBoundsBuffer , PF_A32B32G32R32F ) ;
PassParameters - > NumUpdateBounds = NumUpdateBounds ;
2020-09-08 17:44:06 -04:00
PassParameters - > InfluenceRadiusSq = ClipmapInfluenceRadius * ClipmapInfluenceRadius ;
2022-04-22 19:55:41 -04:00
// Page grid
PassParameters - > PageGridResolution = PageGridResolution ;
PassParameters - > PageGridCoordToWorldCenterScale = ( FVector3f ) PageGridCoordToWorldCenterScale ;
PassParameters - > PageGridCoordToWorldCenterBias = ( FVector3f ) PageGridCoordToWorldCenterBias ;
PassParameters - > PageGridTileWorldExtent = ( FVector3f ) PageTileWorldExtent ;
// Cull grid
PassParameters - > CullGridResolution = CullGridResolution ;
PassParameters - > CullGridCoordToWorldCenterScale = ( FVector3f ) CullGridCoordToWorldCenterScale ;
PassParameters - > CullGridCoordToWorldCenterBias = ( FVector3f ) CullGridCoordToWorldCenterBias ;
PassParameters - > CullGridTileWorldExtent = ( FVector3f ) CullTileWorldExtent ;
2020-07-06 18:58:26 -04:00
auto ComputeShader = View . ShaderMap - > GetShader < FBuildGridTilesCS > ( ) ;
2022-04-22 19:55:41 -04:00
const FIntVector GroupSize = FComputeShaderUtils : : GetGroupCount ( PageGridResolution , FBuildGridTilesCS : : GetGroupSize ( ) ) ;
2020-07-06 18:58:26 -04:00
FComputeShaderUtils : : AddPass (
GraphBuilder ,
2020-09-08 17:44:06 -04:00
RDG_EVENT_NAME ( " BuildPageUpdateTiles %d " , NumUpdateBounds ) ,
2020-07-06 18:58:26 -04:00
ComputeShader ,
PassParameters ,
GroupSize ) ;
}
2020-09-08 17:44:06 -04:00
// Mark pages which contain a heightfield
FRDGBufferRef MarkedHeightfieldPageBuffer = nullptr ;
if ( UpdateRegionHeightfield . ComponentDescriptions . Num ( ) > 0 )
2020-07-06 18:58:26 -04:00
{
2020-09-08 17:44:06 -04:00
RDG_EVENT_SCOPE ( GraphBuilder , " HeightfieldPageAllocation " ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
MarkedHeightfieldPageBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , PageGridSize ) , TEXT ( " MarkedHeightfieldPages " ) ) ;
AddClearUAVPass ( GraphBuilder , GraphBuilder . CreateUAV ( MarkedHeightfieldPageBuffer , PF_R32_UINT ) , 0 ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
const FVector PageVoxelExtent = 0.5f * ClipmapSize / FVector ( ClipmapResolution ) ;
const FVector PageCoordToVoxelCenterScale = ClipmapSize / FVector ( ClipmapResolution ) ;
const FVector PageCoordToVoxelCenterBias = Clipmap . Bounds . Min + PageVoxelExtent ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
for ( TMap < FHeightfieldComponentTextures , TArray < FHeightfieldComponentDescription > > : : TConstIterator It ( UpdateRegionHeightfield . ComponentDescriptions ) ; It ; + + It )
{
const TArray < FHeightfieldComponentDescription > & HeightfieldDescriptions = It . Value ( ) ;
if ( HeightfieldDescriptions . Num ( ) > 0 )
{
2022-02-08 14:04:52 -05:00
FRDGBufferRef HeightfieldDescriptionBuffer = UploadHeightfieldDescriptions ( GraphBuilder , HeightfieldDescriptions ) ;
2020-09-08 17:44:06 -04:00
UTexture2D * HeightfieldTexture = It . Key ( ) . HeightAndNormal ;
UTexture2D * VisibilityTexture = It . Key ( ) . Visibility ;
FMarkHeightfieldPagesCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FMarkHeightfieldPagesCS : : FParameters > ( ) ;
PassParameters - > View = View . ViewUniformBuffer ;
PassParameters - > RWMarkedHeightfieldPageBuffer = GraphBuilder . CreateUAV ( MarkedHeightfieldPageBuffer , PF_R32_UINT ) ;
PassParameters - > PageUpdateIndirectArgBuffer = PageUpdateIndirectArgBuffer ;
PassParameters - > PageUpdateTileBuffer = GraphBuilder . CreateSRV ( PageUpdateTileBuffer , PF_R32_UINT ) ;
PassParameters - > InfluenceRadius = ClipmapInfluenceRadius ;
2022-02-02 07:59:31 -05:00
PassParameters - > PageCoordToPageWorldCenterScale = ( FVector3f ) PageGridCoordToWorldCenterScale ;
PassParameters - > PageCoordToPageWorldCenterBias = ( FVector3f ) PageGridCoordToWorldCenterBias ;
PassParameters - > PageWorldExtent = ( FVector3f ) PageTileWorldExtentWithoutBorders ;
2020-09-08 17:44:06 -04:00
PassParameters - > ClipmapVoxelExtent = ClipmapVoxelExtent . X ;
PassParameters - > PageGridResolution = PageGridResolution ;
PassParameters - > NumHeightfields = HeightfieldDescriptions . Num ( ) ;
PassParameters - > InfluenceRadius = ClipmapInfluenceRadius ;
PassParameters - > HeightfieldThickness = ClipmapVoxelSize . X * GGlobalDistanceFieldHeightFieldThicknessScale ;
2021-05-14 07:17:32 -04:00
PassParameters - > HeightfieldTexture = HeightfieldTexture - > GetResource ( ) - > TextureRHI ;
2020-09-08 17:44:06 -04:00
PassParameters - > HeightfieldSampler = TStaticSamplerState < SF_Bilinear > : : GetRHI ( ) ;
2021-05-14 07:17:32 -04:00
PassParameters - > VisibilityTexture = VisibilityTexture ? VisibilityTexture - > GetResource ( ) - > TextureRHI : GBlackTexture - > TextureRHI ;
2020-09-08 17:44:06 -04:00
PassParameters - > VisibilitySampler = TStaticSamplerState < SF_Bilinear > : : GetRHI ( ) ;
PassParameters - > HeightfieldDescriptions = GraphBuilder . CreateSRV ( HeightfieldDescriptionBuffer , EPixelFormat : : PF_A32B32G32R32F ) ;
auto ComputeShader = View . ShaderMap - > GetShader < FMarkHeightfieldPagesCS > ( ) ;
FComputeShaderUtils : : AddPass (
GraphBuilder ,
RDG_EVENT_NAME ( " MarkHeightfieldPages " ) ,
ComputeShader ,
PassParameters ,
PageUpdateIndirectArgBuffer ,
0 ) ;
}
}
// Build heightfield page compose tile buffer
{
FRDGBufferRef BuildHeightfieldComposeTilesIndirectArgBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateIndirectDesc < FRHIDispatchIndirectParameters > ( 1 ) , TEXT ( " BuildHeightfieldComposeTilesIndirectArgs " ) ) ;
{
FBuildHeightfieldComposeTilesIndirectArgBufferCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FBuildHeightfieldComposeTilesIndirectArgBufferCS : : FParameters > ( ) ;
PassParameters - > RWBuildHeightfieldComposeTilesIndirectArgBuffer = GraphBuilder . CreateUAV ( BuildHeightfieldComposeTilesIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageComposeHeightfieldIndirectArgBuffer = GraphBuilder . CreateUAV ( PageComposeHeightfieldIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > PageUpdateIndirectArgBuffer = GraphBuilder . CreateSRV ( PageUpdateIndirectArgBuffer , PF_R32_UINT ) ;
auto ComputeShader = View . ShaderMap - > GetShader < FBuildHeightfieldComposeTilesIndirectArgBufferCS > ( ) ;
FComputeShaderUtils : : AddPass (
GraphBuilder ,
RDG_EVENT_NAME ( " BuildHeightfieldComposeTilesIndirectArgs " ) ,
ComputeShader ,
PassParameters ,
FIntVector ( 1 , 1 , 1 ) ) ;
}
{
FBuildHeightfieldComposeTilesCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FBuildHeightfieldComposeTilesCS : : FParameters > ( ) ;
PassParameters - > View = View . ViewUniformBuffer ;
PassParameters - > RWPageComposeHeightfieldIndirectArgBuffer = GraphBuilder . CreateUAV ( PageComposeHeightfieldIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageComposeHeightfieldTileBuffer = GraphBuilder . CreateUAV ( PageComposeHeightfieldTileBuffer , PF_R32_UINT ) ;
PassParameters - > PageUpdateTileBuffer = GraphBuilder . CreateSRV ( PageUpdateTileBuffer , PF_R32_UINT ) ;
PassParameters - > MarkedHeightfieldPageBuffer = GraphBuilder . CreateSRV ( MarkedHeightfieldPageBuffer , PF_R32_UINT ) ;
PassParameters - > PageUpdateIndirectArgBuffer = GraphBuilder . CreateSRV ( PageUpdateIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > BuildHeightfieldComposeTilesIndirectArgBuffer = BuildHeightfieldComposeTilesIndirectArgBuffer ;
auto ComputeShader = View . ShaderMap - > GetShader < FBuildHeightfieldComposeTilesCS > ( ) ;
FComputeShaderUtils : : AddPass (
GraphBuilder ,
RDG_EVENT_NAME ( " BuildHeightfieldComposeTiles " ) ,
ComputeShader ,
PassParameters ,
BuildHeightfieldComposeTilesIndirectArgBuffer ,
0 ) ;
}
}
2020-07-06 18:58:26 -04:00
}
2022-04-22 19:55:41 -04:00
const uint32 AverageCulledObjectsPerPage = FMath : : Clamp ( CVarAOGlobalDistanceFieldAverageCulledObjectsPerCell . GetValueOnRenderThread ( ) , 1 , 8192 ) ;
2020-09-08 17:44:06 -04:00
FRDGBufferRef CullGridAllocator = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , 1 ) , TEXT ( " CullGridAllocator " ) ) ;
2022-04-22 19:55:41 -04:00
FRDGBufferRef CullGridObjectHeader = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , 2 * CullGridSize ) , TEXT ( " CullGridObjectHeader " ) ) ;
FRDGBufferRef CullGridObjectArray = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , CullGridSize * AverageCulledObjectsPerPage ) , TEXT ( " CullGridObjectArray " ) ) ;
2020-07-06 18:58:26 -04:00
2022-04-26 14:37:07 -04:00
FDistanceFieldObjectBufferParameters DistanceFieldObjectBuffers = DistanceField : : SetupObjectBufferParameters ( GraphBuilder , DistanceFieldSceneData ) ;
FDistanceFieldAtlasParameters DistanceFieldAtlas = DistanceField : : SetupAtlasParameters ( GraphBuilder , DistanceFieldSceneData ) ;
Sparse, narrow band, streamed Mesh Signed Distance Fields
* SDFs are now generated, allocated from the atlas and uploaded in 8^3 bricks (7^3 unique data, half voxel padding).
* Tracing must load the brick index from the indirection table, and only bricks near the surface are stored
* 3 mips are now generated, with the lowest resolution always loaded and the other 2 streamed
* SDFs are now G8 narrow band. Lower resolution mips must be traversed when querying distance to nearest surface far away from the surface
* The Distance Field Brick Atlas is now stored for each FScene and dynamically resized based on needs with a GPU memcopy
* Brick atlas uses a 1d pooled allocator which has no fragmentation and greatly reduces packing waste over the 3d allocator
* Added new indirection for Distance Field Asset data, so that only a single entry needs to be updated when a mip is streamed in or out in scenes with millions of instances
* Compute shaders operating on distance field instances generate streaming requests, which are async read back to CPU, turned into IO requests, which are polled and when complete uploaded to atlases
* Any mesh instance inside the Global SDF extent (200m) requests mip1, and at 50m requests mip2
* Now using a batched compute scatter to upload to the distance field atlas instead of RHIUpdateTexture3d, to bypass alignment restrictions and per-upload overhead
* Distance Field streaming uses an async task to move Memcpy and IO request overhead off of the Rendering Thread
* Distance Field Visualization now computes a normal from the SDF gradient and does simple lighting to better visualize the scene representation
* Increased r.DistanceFields.MaxPerMeshResolution from 128 to 512, to better represent large objects
* Mesh SDF generation now uses an Embree point query to calculate closest unsigned distance, and then a much smaller set of rays to count backfaces for negative region determination, for an 11x speedup
* Upgraded mesh utilities to Embree 3.12.2 to get point queries
* Fixed wrong transform used for SDF normals in Lumen, causing non-uniformly scaled meshes to have incorrect Surface Cache interpolation
* Fixed Static Mesh materials not getting PostLoaded before SDF build, causing their blend modes to be wrong for the build, which corrupts the DDC. Also included those blend modes in the DDC key.
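The brick arithmetic in the notes above can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: all names are hypothetical, the brick count is assumed to be the voxel count divided by the 7 unique voxels per brick and rounded up, G8 is assumed to cost one byte per texel, and the streaming thresholds simply restate the 200m and 50m figures using the note's own mip numbering.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants taken from the commit note: 8^3 texels stored per
// brick, of which 7^3 are unique (the rest is half-voxel padding).
constexpr uint32_t BrickSize = 8;
constexpr uint32_t UniqueDataBrickSize = 7;

// Assumption: bricks needed along one axis to cover `Voxels` unique voxels.
constexpr uint32_t BricksPerAxis(uint32_t Voxels)
{
    return (Voxels + UniqueDataBrickSize - 1) / UniqueDataBrickSize;
}

// Assumption: bytes for a fully dense mip at one byte per texel (G8).
// A sparse mip stores only the bricks near the surface, so this is an
// upper bound.
constexpr uint64_t DenseBrickBytes(uint32_t NumBricksPerAxis)
{
    return uint64_t(NumBricksPerAxis) * NumBricksPerAxis * NumBricksPerAxis
        * BrickSize * BrickSize * BrickSize;
}

// Assumption: mip requested for an instance at the given distance, numbered
// as in the note (0 = always resident, 1 inside 200m, 2 at 50m).
constexpr uint32_t RequestedMip(float DistanceMeters)
{
    return DistanceMeters <= 50.0f ? 2u : (DistanceMeters <= 200.0f ? 1u : 0u);
}
```

For example, under these assumptions a 128-voxel axis needs 19 bricks, and an instance 100m away would request mip1 only.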
Original costs on 1080 GTX (full updates on everything and no screen traces)
10.60ms UpdateGlobalDistanceField
3.62ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
1.73ms VoxelizeCards Clipmaps=[0,1,2,3]
0.38ms TraceCards 1 dispatch 1 groups
0.51ms TraceCards 1 dispatch 1 groups
Sparse SDF costs
12.06ms UpdateGlobalDistanceField
4.35ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
2.30ms VoxelizeCards Clipmaps=[0,1,2,3]
0.69ms TraceCards 1 dispatch 1 groups
0.77ms TraceCards 1 dispatch 1 groups
Tested: TopazEntry PC, Reverb PC and PS5, EngineTests, QAGame, Rift, Frosty P_Construct_WP, FortGPUTestbed
#rb Krzysztof.Narkowicz
#ROBOMERGE-OWNER: Daniel.Wright
#ROBOMERGE-AUTHOR: daniel.wright
#ROBOMERGE-SOURCE: CL 15784493 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v783-15756269)
#ROBOMERGE-CONFLICT from-shelf
[CL 15790658 by Daniel Wright in ue5-main branch]
2021-03-23 22:40:05 -04:00
2020-07-06 18:58:26 -04:00
// Cull objects into a cull grid
2022-01-31 04:59:02 -05:00
if ( Scene - > DistanceFieldSceneData . NumObjectsInBuffer > 0 )
2020-07-06 18:58:26 -04:00
{
AddClearUAVPass ( GraphBuilder , GraphBuilder . CreateUAV ( CullGridAllocator , PF_R32_UINT ) , 0 ) ;
AddClearUAVPass ( GraphBuilder , GraphBuilder . CreateUAV ( CullGridObjectHeader , PF_R32_UINT ) , 0 ) ;
FCullObjectsToGridCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FCullObjectsToGridCS : : FParameters > ( ) ;
PassParameters - > RWCullGridAllocator = GraphBuilder . CreateUAV ( CullGridAllocator , PF_R32_UINT ) ;
PassParameters - > RWCullGridObjectHeader = GraphBuilder . CreateUAV ( CullGridObjectHeader , PF_R32_UINT ) ;
PassParameters - > RWCullGridObjectArray = GraphBuilder . CreateUAV ( CullGridObjectArray , PF_R32_UINT ) ;
2022-04-22 19:55:41 -04:00
PassParameters - > CullGridIndirectArgBuffer = CullGridUpdateIndirectArgBuffer ;
PassParameters - > CullGridTileBuffer = GraphBuilder . CreateSRV ( CullGridUpdateTileBuffer , PF_R32_UINT ) ;
2020-07-06 18:58:26 -04:00
PassParameters - > ObjectIndexBuffer = GraphBuilder . CreateSRV ( ObjectIndexBuffer , PF_R32_UINT ) ;
PassParameters - > ObjectIndexNumBuffer = GraphBuilder . CreateSRV ( ObjectIndexNumBuffer , PF_R32_UINT ) ;
2021-03-23 22:40:05 -04:00
PassParameters - > DistanceFieldObjectBuffers = DistanceFieldObjectBuffers ;
2022-04-22 19:55:41 -04:00
PassParameters - > CullGridResolution = CullGridResolution ;
PassParameters - > CullGridCoordToWorldCenterScale = ( FVector3f ) CullGridCoordToWorldCenterScale ;
PassParameters - > CullGridCoordToWorldCenterBias = ( FVector3f ) CullGridCoordToWorldCenterBias ;
PassParameters - > CullTileWorldExtent = ( FVector3f ) CullTileWorldExtent ;
2020-09-08 17:44:06 -04:00
PassParameters - > InfluenceRadiusSq = ClipmapInfluenceRadius * ClipmapInfluenceRadius ;
2020-07-06 18:58:26 -04:00
auto ComputeShader = View . ShaderMap - > GetShader < FCullObjectsToGridCS > ( ) ;
FComputeShaderUtils : : AddPass (
GraphBuilder ,
RDG_EVENT_NAME ( " CullObjectsToGrid " ) ,
ComputeShader ,
PassParameters ,
2022-04-22 19:55:41 -04:00
CullGridUpdateIndirectArgBuffer ,
2020-07-06 18:58:26 -04:00
0 ) ;
}
2020-09-08 17:44:06 -04:00
// Allocate and build page lists
2020-07-06 18:58:26 -04:00
{
2020-09-08 17:44:06 -04:00
FRDGBufferRef PageFreeListReturnAllocatorBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , 1 ) , TEXT ( " PageFreeListReturnAllocator " ) ) ;
2022-01-26 17:07:27 -05:00
FRDGBufferRef PageFreeListReturnBuffer = GraphBuilder . CreateBuffer ( FRDGBufferDesc : : CreateStructuredDesc ( sizeof ( uint32 ) , GlobalDistanceField : : GetMaxPageNum ( bLumenEnabled , View . FinalPostProcessSettings . LumenSceneViewDistance ) ) , TEXT ( " PageFreeListReturn " ) ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
AddClearUAVPass ( GraphBuilder , GraphBuilder . CreateUAV ( PageFreeListReturnAllocatorBuffer , PF_R32_UINT ) , 0 ) ;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
// Allocate pages for objects
{
FAllocatePagesCS : : FParameters * PassParameters = GraphBuilder . AllocParameters < FAllocatePagesCS : : FParameters > ( ) ;
PassParameters - > View = View . ViewUniformBuffer ;
PassParameters - > PageUpdateIndirectArgBuffer = PageUpdateIndirectArgBuffer ;
PassParameters - > PageUpdateTileBuffer = GraphBuilder . CreateSRV ( PageUpdateTileBuffer , PF_R32_UINT ) ;
PassParameters - > MarkedHeightfieldPageBuffer = MarkedHeightfieldPageBuffer ? GraphBuilder . CreateSRV ( MarkedHeightfieldPageBuffer , PF_R32_UINT ) : nullptr ;
2022-02-08 16:53:35 -05:00
PassParameters - > RWPageTableCombinedTexture = PageTableCombinedTexture ? GraphBuilder . CreateUAV ( PageTableCombinedTexture ) : nullptr ;
2020-09-08 17:44:06 -04:00
PassParameters - > RWPageTableLayerTexture = GraphBuilder . CreateUAV ( PageTableLayerTexture ) ;
PassParameters - > RWPageFreeListAllocatorBuffer = GraphBuilder . CreateUAV ( PageFreeListAllocatorBuffer , PF_R32_SINT ) ;
PassParameters - > PageFreeListBuffer = GraphBuilder . CreateSRV ( PageFreeListBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageFreeListReturnAllocatorBuffer = GraphBuilder . CreateUAV ( PageFreeListReturnAllocatorBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageFreeListReturnBuffer = GraphBuilder . CreateUAV ( PageFreeListReturnBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageComposeTileBuffer = GraphBuilder . CreateUAV ( PageComposeTileBuffer , PF_R32_UINT ) ;
PassParameters - > RWPageComposeIndirectArgBuffer = GraphBuilder . CreateUAV ( PageComposeIndirectArgBuffer , PF_R32_UINT ) ;
PassParameters - > ParentPageTableLayerTexture = ParentPageTableLayerTexture ;
2022-02-02 07:59:31 -05:00
PassParameters - > PageWorldExtent = ( FVector3f ) PageTileWorldExtentWithoutBorders ;
2020-09-08 17:44:06 -04:00
PassParameters - > PageWorldRadius = PageTileWorldExtentWithoutBorders . Size ( ) ;
PassParameters - > ClipmapInfluenceRadius = ClipmapInfluenceRadius ;
PassParameters - > PageGridResolution = PageGridResolution ;
2022-02-02 07:59:31 -05:00
PassParameters - > InvPageGridResolution = FVector3f : : OneVector / ( FVector3f ) PageGridResolution ;
2020-09-08 17:44:06 -04:00
PassParameters - > GlobalDistanceFieldMaxPageNum = GGlobalDistanceFieldMaxPageNum ;
2022-02-02 07:59:31 -05:00
PassParameters - > PageCoordToPageWorldCenterScale = ( FVector3f ) PageGridCoordToWorldCenterScale ;
PassParameters - > PageCoordToPageWorldCenterBias = ( FVector3f ) PageGridCoordToWorldCenterBias ;
2020-09-08 17:44:06 -04:00
PassParameters - > ClipmapVolumeWorldToUVAddAndMul = ClipmapVolumeWorldToUVAddAndMul ;
PassParameters - > PageTableClipmapOffsetZ = ClipmapIndex * PageGridResolution . Z ;
PassParameters - > CullGridObjectHeader = GraphBuilder . CreateSRV ( CullGridObjectHeader , PF_R32_UINT ) ;
PassParameters - > CullGridObjectArray = GraphBuilder . CreateSRV ( CullGridObjectArray , PF_R32_UINT ) ;
2022-04-22 19:55:41 -04:00
PassParameters - > CullGridResolution = CullGridResolution ;
2020-09-08 17:44:06 -04:00
2021-03-23 22:40:05 -04:00
PassParameters->DistanceFieldObjectBuffers = DistanceFieldObjectBuffers;
PassParameters->DistanceFieldAtlas = DistanceFieldAtlas;
2020-09-08 17:44:06 -04:00
FAllocatePagesCS::FPermutationDomain PermutationVector;
2022-01-31 04:59:02 -05:00
PermutationVector.Set<FAllocatePagesCS::FProcessDistanceFields>(Scene->DistanceFieldSceneData.NumObjectsInBuffer > 0);
2020-09-08 17:44:06 -04:00
PermutationVector.Set<FAllocatePagesCS::FMarkedHeightfieldPageBuffer>(MarkedHeightfieldPageBuffer != nullptr);
PermutationVector.Set<FAllocatePagesCS::FComposeParentDistanceField>(ParentPageTableLayerTexture != nullptr);
2022-04-25 07:32:32 -04:00
extern int32 GDistanceFieldOffsetDataStructure;
PermutationVector.Set<FAllocatePagesCS::FOffsetDataStructure>(GDistanceFieldOffsetDataStructure);
2020-09-08 17:44:06 -04:00
auto ComputeShader = View.ShaderMap->GetShader<FAllocatePagesCS>(PermutationVector);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("AllocatePages"),
ComputeShader,
PassParameters,
PageUpdateIndirectArgBuffer,
0);
}
FRDGBufferRef FreeListReturnIndirectArgBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("FreeListReturnIndirectArgs"));
// Setup free list return indirect dispatch arguments
{
FPageFreeListReturnIndirectArgBufferCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPageFreeListReturnIndirectArgBufferCS::FParameters>();
PassParameters->RWFreeListReturnIndirectArgBuffer = GraphBuilder.CreateUAV(FreeListReturnIndirectArgBuffer, PF_R32_UINT);
PassParameters->RWPageFreeListAllocatorBuffer = GraphBuilder.CreateUAV(PageFreeListAllocatorBuffer, PF_R32_SINT);
PassParameters->PageFreeListReturnAllocatorBuffer = GraphBuilder.CreateSRV(PageFreeListReturnAllocatorBuffer, PF_R32_UINT);
auto ComputeShader = View.ShaderMap->GetShader<FPageFreeListReturnIndirectArgBufferCS>();
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("SetupPageFreeListReturnIndirectArgs"),
ComputeShader,
PassParameters,
FIntVector(1, 1, 1));
}
// Return to the free list
{
FPageFreeListReturnCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPageFreeListReturnCS::FParameters>();
PassParameters->FreeListReturnIndirectArgBuffer = FreeListReturnIndirectArgBuffer;
PassParameters->RWPageFreeListAllocatorBuffer = GraphBuilder.CreateUAV(PageFreeListAllocatorBuffer, PF_R32_SINT);
PassParameters->RWPageFreeListBuffer = GraphBuilder.CreateUAV(PageFreeListBuffer, PF_R32_UINT);
PassParameters->PageFreeListReturnAllocatorBuffer = GraphBuilder.CreateSRV(PageFreeListReturnAllocatorBuffer, PF_R32_UINT);
PassParameters->PageFreeListReturnBuffer = GraphBuilder.CreateSRV(PageFreeListReturnBuffer, PF_R32_UINT);
auto ComputeShader = View.ShaderMap->GetShader<FPageFreeListReturnCS>();
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ReturnToPageFreeList"),
ComputeShader,
PassParameters,
FreeListReturnIndirectArgBuffer,
0);
}
}
2022-02-02 05:42:31 -05:00
// Initialize pages and compose the mesh SDFs into allocated pages
if (Scene->DistanceFieldSceneData.NumObjectsInBuffer > 0 || UpdateRegionHeightfield.ComponentDescriptions.Num() > 0)
2020-09-08 17:44:06 -04:00
{
const FVector PageVoxelExtent = 0.5f * ClipmapSize / FVector(ClipmapResolution);
const FVector PageCoordToVoxelCenterScale = ClipmapSize / FVector(ClipmapResolution);
const FVector PageCoordToVoxelCenterBias = Clipmap.Bounds.Min + PageVoxelExtent;
const uint32 PageComposeTileSize = 4;
const FVector PageComposeTileWorldExtent = ClipmapVoxelExtent * PageComposeTileSize;
FComposeObjectsIntoPagesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FComposeObjectsIntoPagesCS::FParameters>();
2020-07-06 18:58:26 -04:00
PassParameters->View = View.ViewUniformBuffer;
2020-09-08 17:44:06 -04:00
PassParameters->RWPageAtlasTexture = GraphBuilder.CreateUAV(PageAtlasTexture);
2022-03-01 21:07:45 -05:00
PassParameters->RWCoverageAtlasTexture = CoverageAtlasTexture ? GraphBuilder.CreateUAV(CoverageAtlasTexture) : nullptr;
2020-09-08 17:44:06 -04:00
PassParameters->ComposeIndirectArgBuffer = PageComposeIndirectArgBuffer;
PassParameters->ComposeTileBuffer = GraphBuilder.CreateSRV(PageComposeTileBuffer, PF_R32_UINT);
PassParameters->PageTableLayerTexture = PageTableLayerTexture;
PassParameters->ParentPageTableLayerTexture = ParentPageTableLayerTexture;
2020-07-06 18:58:26 -04:00
PassParameters->CullGridObjectHeader = GraphBuilder.CreateSRV(CullGridObjectHeader, PF_R32_UINT);
PassParameters->CullGridObjectArray = GraphBuilder.CreateSRV(CullGridObjectArray, PF_R32_UINT);
PassParameters->ObjectIndexBuffer = GraphBuilder.CreateSRV(ObjectIndexBuffer, PF_R32_UINT);
PassParameters->ObjectIndexNumBuffer = GraphBuilder.CreateSRV(ObjectIndexNumBuffer, PF_R32_UINT);
Sparse, narrow band, streamed Mesh Signed Distance Fields
* SDFs are now generated, allocated from the atlas and uploaded in 8^3 bricks (7^3 unique data, half voxel padding).
* Tracing must load the brick index from the indirection table, and only bricks near the surface are stored
* 3 mips are now generated, with the lowest resolution always loaded and the other 2 streamed
* SDFs are now G8 narrow band. Lower resolution mips must be traversed when querying distance to nearest surface far away from the surface
* The Distance Field Brick Atlas is now stored for each FScene and dynamically resized based on needs with a GPU memcopy
* Brick atlas uses a 1d pooled allocator which has no fragmentation and greatly reduces packing waste over the 3d allocator
* Added new indirection for Distance Field Asset data, so that only a single entry needs to be updated when a mip is streamed in or out in scenes with millions of instances
* Compute shaders operating on distance field instances generate streaming requests, which are async read back to CPU, turned into IO requests, which are polled and when complete uploaded to atlases
* Any mesh instance inside the Global SDF extent (200m) requests mip1, and at 50m requests mip2
* Now using a batched compute scatter to upload to the distance field atlas instead of RHIUpdateTexture3d, to bypass alignment restrictions and per-upload overhead
* Distance Field streaming uses an async task to move Memcpy and IO request overhead off of the Rendering Thread
* Distance Field Visualization now computes a normal from the SDF gradient and does simple lighting to better visualize the scene representation
* Increased r.DistanceFields.MaxPerMeshResolution from 128 to 512, to better represent large objects
* Mesh SDF generation now uses an Embree point query to calculate closest unsigned distance, and then a much smaller set of rays to count backfaces for negative region determination, for an 11x speedup
* Upgraded mesh utilities to Embree 3.12.2 to get point queries
* Fixed wrong transform used for SDF normals in Lumen, causing non-uniformly scaled meshes to have incorrect Surface Cache interpolation
* Fixed Static Mesh materials not getting PostLoaded before SDF build, causing their blend modes to be wrong for the build, which corrupts the DDC. Also included those blend modes in the DDC key.
Original costs on 1080 GTX (full updates on everything and no screen traces)
10.60ms UpdateGlobalDistanceField
3.62ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
1.73ms VoxelizeCards Clipmaps=[0,1,2,3]
0.38ms TraceCards 1 dispatch 1 groups
0.51ms TraceCards 1 dispatch 1 groups
Sparse SDF costs
12.06ms UpdateGlobalDistanceField
4.35ms LumenReflectiveTest.DirectionalLight_1 Shadowmap 1
2.30ms VoxelizeCards Clipmaps=[0,1,2,3]
0.69ms TraceCards 1 dispatch 1 groups
0.77ms TraceCards 1 dispatch 1 groups
Tested: TopazEntry PC, Reverb PC and PS5, EngineTests, QAGame, Rift, Frosty P_Construct_WP, FortGPUTestbed
#rb Krzysztof.Narkowicz
#ROBOMERGE-OWNER: Daniel.Wright
#ROBOMERGE-AUTHOR: daniel.wright
#ROBOMERGE-SOURCE: CL 15784493 in //UE5/Release-5.0-EarlyAccess/...
#ROBOMERGE-BOT: STARSHIP (Release-5.0-EarlyAccess -> Main) (v783-15756269)
#ROBOMERGE-CONFLICT from-shelf
[CL 15790658 by Daniel Wright in ue5-main branch]
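The distance-based mip requests described in the message above can be sketched as follows. This is only a minimal illustration, not the engine's streaming code: `WantedMipLevel` and both constants are hypothetical names, distances are in meters as quoted in the message, and the sketch assumes the message's numbering counts up from the always-resident lowest-resolution mip (level 0).

```cpp
#include <cstdint>

// Hypothetical constants, taken from the distances quoted in the commit message (meters).
constexpr float kGlobalSdfExtentMeters = 200.0f;
constexpr float kHighDetailDistanceMeters = 50.0f;

// Returns the most detailed mip level an instance would request at a given distance.
// Assumption: level 0 is the always-loaded lowest-resolution mip,
// levels 1 and 2 are the two streamed mips.
inline uint32_t WantedMipLevel(float DistanceToViewMeters)
{
    if (DistanceToViewMeters <= kHighDetailDistanceMeters)
    {
        return 2;
    }
    if (DistanceToViewMeters <= kGlobalSdfExtentMeters)
    {
        return 1;
    }
    return 0;
}
```

In this sketch an instance 10m away would ask for level 2, one 100m away for level 1, and anything beyond the 200m extent stays on the resident level 0.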
2021-03-23 22:40:05 -04:00
PassParameters->DistanceFieldObjectBuffers = DistanceFieldObjectBuffers;
PassParameters->DistanceFieldAtlas = DistanceFieldAtlas;
2020-09-08 17:44:06 -04:00
PassParameters->InfluenceRadius = ClipmapInfluenceRadius;
PassParameters->InfluenceRadiusSq = ClipmapInfluenceRadius * ClipmapInfluenceRadius;
PassParameters->ClipmapVoxelExtent = ClipmapVoxelExtent.X;
2022-04-22 19:55:41 -04:00
PassParameters->CullGridResolution = CullGridResolution;
2020-09-08 17:44:06 -04:00
PassParameters->PageGridResolution = PageGridResolution;
2022-02-02 07:59:31 -05:00
PassParameters->InvPageGridResolution = FVector3f::OneVector / (FVector3f)PageGridResolution;
2020-09-08 17:44:06 -04:00
PassParameters->ClipmapResolution = FIntVector(ClipmapResolution);
2022-02-02 07:59:31 -05:00
PassParameters->PageCoordToVoxelCenterScale = (FVector3f)PageCoordToVoxelCenterScale;
PassParameters->PageCoordToVoxelCenterBias = (FVector3f)PageCoordToVoxelCenterBias;
PassParameters->ComposeTileWorldExtent = (FVector3f)PageComposeTileWorldExtent;
PassParameters->ClipmapMinBounds = (FVector3f)Clipmap.Bounds.Min;
PassParameters->PageCoordToPageWorldCenterScale = (FVector3f)PageGridCoordToWorldCenterScale;
PassParameters->PageCoordToPageWorldCenterBias = (FVector3f)PageGridCoordToWorldCenterBias;
2020-09-08 17:44:06 -04:00
PassParameters->ClipmapVolumeWorldToUVAddAndMul = ClipmapVolumeWorldToUVAddAndMul;
PassParameters->PageTableClipmapOffsetZ = ClipmapIndex * PageGridResolution.Z;
2020-07-06 18:58:26 -04:00
2020-09-08 17:44:06 -04:00
FComposeObjectsIntoPagesCS::FPermutationDomain PermutationVector;
PermutationVector.Set<FComposeObjectsIntoPagesCS::FComposeParentDistanceField>(ParentPageTableLayerTexture != nullptr);
2022-02-02 05:42:31 -05:00
PermutationVector.Set<FComposeObjectsIntoPagesCS::FProcessDistanceFields>(Scene->DistanceFieldSceneData.NumObjectsInBuffer > 0);
2022-03-01 21:07:45 -05:00
PermutationVector.Set<FComposeObjectsIntoPagesCS::FCompositeCoverageAtlas>(CoverageAtlasTexture != nullptr);
2022-04-25 07:32:32 -04:00
extern int32 GDistanceFieldOffsetDataStructure;
PermutationVector.Set<FComposeObjectsIntoPagesCS::FOffsetDataStructure>(GDistanceFieldOffsetDataStructure);
2020-09-08 17:44:06 -04:00
auto ComputeShader = View.ShaderMap->GetShader<FComposeObjectsIntoPagesCS>(PermutationVector);
2020-07-06 18:58:26 -04:00
FComputeShaderUtils::AddPass(
GraphBuilder,
2020-09-08 17:44:06 -04:00
RDG_EVENT_NAME("ComposeObjectsIntoPages"),
2020-07-06 18:58:26 -04:00
ComputeShader,
PassParameters,
2020-09-08 17:44:06 -04:00
PageComposeIndirectArgBuffer,
2020-07-06 18:58:26 -04:00
0);
2015-05-11 20:04:15 -04:00
}
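The scale/bias pair built for this pass (`PageCoordToVoxelCenterScale`, `PageCoordToVoxelCenterBias`) encodes a voxel-coordinate to world-space mapping. The sketch below reproduces that arithmetic along a single axis. `FVoxelCenterTransform`, `MakeVoxelCenterTransform` and `VoxelCenter` are hypothetical names, and the `Coord * Scale + Bias` application is an assumption about how the shader, which is not shown here, consumes the two values.

```cpp
// One-axis stand-in for the FVector scale/bias pair computed on the CPU side.
struct FVoxelCenterTransform
{
    float Scale; // world-space size of one voxel
    float Bias;  // world-space center of voxel 0
};

// Mirrors the CPU-side setup: Scale = ClipmapSize / ClipmapResolution,
// Bias = Clipmap.Bounds.Min + half a voxel.
inline FVoxelCenterTransform MakeVoxelCenterTransform(float ClipmapMin, float ClipmapSize, int ClipmapResolution)
{
    const float VoxelSize = ClipmapSize / static_cast<float>(ClipmapResolution);
    return FVoxelCenterTransform{ VoxelSize, ClipmapMin + 0.5f * VoxelSize };
}

// Assumed shader-side use: a single multiply-add per axis.
inline float VoxelCenter(const FVoxelCenterTransform& Transform, int VoxelCoord)
{
    return static_cast<float>(VoxelCoord) * Transform.Scale + Transform.Bias;
}
```

For a clipmap spanning [-100, 100] at resolution 4, voxel 0 is centered at -75 and voxel 3 at 75, so the half-voxel bias keeps every sample inside the clipmap bounds.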
2020-09-08 17:44:06 -04:00
// Compose heightfields into global SDF pages
2022-01-21 23:23:16 -05:00
if (GAOGlobalDistanceFieldHeightfield != 0 && UpdateRegionHeightfield.ComponentDescriptions.Num() > 0)
2015-05-11 20:04:15 -04:00
{
2020-09-08 17:44:06 -04:00
RDG_EVENT_SCOPE(GraphBuilder, "ComposeHeightfieldsIntoPages");
2018-09-11 14:44:10 -04:00
2020-09-08 17:44:06 -04:00
const FVector PageVoxelExtent = 0.5f * ClipmapSize / FVector(ClipmapResolution);
const FVector PageCoordToVoxelCenterScale = ClipmapSize / FVector(ClipmapResolution);
const FVector PageCoordToVoxelCenterBias = Clipmap.Bounds.Min + PageVoxelExtent;
for (TMap<FHeightfieldComponentTextures, TArray<FHeightfieldComponentDescription>>::TConstIterator It(UpdateRegionHeightfield.ComponentDescriptions); It; ++It)
2018-09-11 14:44:10 -04:00
{
2020-09-08 17:44:06 -04:00
const TArray<FHeightfieldComponentDescription>& HeightfieldDescriptions = It.Value();
if (HeightfieldDescriptions.Num() > 0)
{
2022-02-08 14:04:52 -05:00
FRDGBufferRef HeightfieldDescriptionBuffer = UploadHeightfieldDescriptions(GraphBuilder, HeightfieldDescriptions);
2020-09-08 17:44:06 -04:00
UTexture2D* HeightfieldTexture = It.Key().HeightAndNormal;
UTexture2D* VisibilityTexture = It.Key().Visibility;
FComposeHeightfieldsIntoPagesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FComposeHeightfieldsIntoPagesCS::FParameters>();
PassParameters->View = View.ViewUniformBuffer;
PassParameters->RWPageAtlasTexture = GraphBuilder.CreateUAV(PageAtlasTexture);
2022-03-01 21:07:45 -05:00
PassParameters->RWCoverageAtlasTexture = CoverageAtlasTexture ? GraphBuilder.CreateUAV(CoverageAtlasTexture) : nullptr;
2020-09-08 17:44:06 -04:00
PassParameters->ComposeIndirectArgBuffer = PageComposeHeightfieldIndirectArgBuffer;
PassParameters->ComposeTileBuffer = GraphBuilder.CreateSRV(PageComposeHeightfieldTileBuffer, PF_R32_UINT);
PassParameters->PageTableLayerTexture = PageTableLayerTexture;
PassParameters->InfluenceRadius = ClipmapInfluenceRadius;
2022-02-02 07:59:31 -05:00
PassParameters->PageCoordToVoxelCenterScale = (FVector3f)PageCoordToVoxelCenterScale;
PassParameters->PageCoordToVoxelCenterBias = (FVector3f)PageCoordToVoxelCenterBias;
2020-09-08 17:44:06 -04:00
PassParameters->ClipmapVoxelExtent = ClipmapVoxelExtent.X;
PassParameters->PageGridResolution = PageGridResolution;
2022-02-02 07:59:31 -05:00
PassParameters->InvPageGridResolution = FVector3f::OneVector / (FVector3f)PageGridResolution;
PassParameters->PageCoordToPageWorldCenterScale = (FVector3f)PageGridCoordToWorldCenterScale;
PassParameters->PageCoordToPageWorldCenterBias = (FVector3f)PageGridCoordToWorldCenterBias;
2020-09-08 17:44:06 -04:00
PassParameters->ClipmapVolumeWorldToUVAddAndMul = ClipmapVolumeWorldToUVAddAndMul;
PassParameters->PageTableClipmapOffsetZ = ClipmapIndex * PageGridResolution.Z;
PassParameters->NumHeightfields = HeightfieldDescriptions.Num();
PassParameters->InfluenceRadius = ClipmapInfluenceRadius;
PassParameters->HeightfieldThickness = ClipmapVoxelSize.X * GGlobalDistanceFieldHeightFieldThicknessScale;
2021-05-14 07:17:32 -04:00
PassParameters->HeightfieldTexture = HeightfieldTexture->GetResource()->TextureRHI;
2020-09-08 17:44:06 -04:00
PassParameters->HeightfieldSampler = TStaticSamplerState<SF_Bilinear>::GetRHI();
2021-05-14 07:17:32 -04:00
PassParameters->VisibilityTexture = VisibilityTexture ? VisibilityTexture->GetResource()->TextureRHI : GBlackTexture->TextureRHI;
2020-09-08 17:44:06 -04:00
PassParameters->VisibilitySampler = TStaticSamplerState<SF_Bilinear>::GetRHI();
PassParameters->HeightfieldDescriptions = GraphBuilder.CreateSRV(HeightfieldDescriptionBuffer, EPixelFormat::PF_A32B32G32R32F);
2022-03-01 21:07:45 -05:00
FComposeHeightfieldsIntoPagesCS::FPermutationDomain PermutationVector;
PermutationVector.Set<FComposeHeightfieldsIntoPagesCS::FCompositeCoverageAtlas>(CoverageAtlasTexture != nullptr);
auto ComputeShader = View.ShaderMap->GetShader<FComposeHeightfieldsIntoPagesCS>(PermutationVector);
2020-09-08 17:44:06 -04:00
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ComposeHeightfield"),
ComputeShader,
PassParameters,
PageComposeHeightfieldIndirectArgBuffer,
0);
}
2018-09-11 14:44:10 -04:00
}
2015-05-11 20:04:15 -04:00
}
2020-09-15 11:03:59 -04:00
if (MipTexture && CacheType == GDF_Full)
{
RDG_EVENT_SCOPE(GraphBuilder, "Coarse Clipmap");
2021-02-04 15:30:42 -04:00
const int32 ClipmapMipResolution = GlobalDistanceField::GetClipmapMipResolution(bLumenEnabled);
2020-09-15 11:03:59 -04:00
// Propagate distance field
const int32 NumPropagationSteps = 5;
for (int32 StepIndex = 0; StepIndex < NumPropagationSteps; ++StepIndex)
{
FRDGTextureRef PrevTexture = TempMipTexture;
FRDGTextureRef NextTexture = MipTexture;
uint32 PrevClipmapOffsetZ = 0;
uint32 NextClipmapOffsetZ = ClipmapIndex * ClipmapMipResolution;
2022-02-08 16:52:30 -05:00
if (StepIndex % 2 == NumPropagationSteps % 2)
2020-09-15 11:03:59 -04:00
{
Swap(PrevTexture, NextTexture);
Swap(PrevClipmapOffsetZ, NextClipmapOffsetZ);
}
FPropagateMipDistanceCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPropagateMipDistanceCS::FParameters>();
PassParameters->View = View.ViewUniformBuffer;
PassParameters->RWMipTexture = GraphBuilder.CreateUAV(NextTexture);
2022-02-08 16:53:35 -05:00
PassParameters->PageTableTexture = GAOGlobalDistanceFieldCacheMostlyStaticSeparately ? PageTableCombinedTexture : PageTableLayerTexture;
2020-09-15 11:03:59 -04:00
PassParameters->PageAtlasTexture = PageAtlasTexture;
2022-02-02 07:59:31 -05:00
PassParameters->GlobalDistanceFieldInvPageAtlasSize = FVector3f::OneVector / FVector3f(GlobalDistanceField::GetPageAtlasSize(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance));
2022-01-26 17:07:27 -05:00
PassParameters->GlobalDistanceFieldClipmapSizeInPages = GlobalDistanceField::GetPageTableTextureResolution(bLumenEnabled, View.FinalPostProcessSettings.LumenSceneViewDistance).X;
2020-09-15 11:03:59 -04:00
PassParameters->PrevMipTexture = PrevTexture;
PassParameters->ClipmapMipResolution = ClipmapMipResolution;
2022-02-24 20:40:01 -05:00
PassParameters->OneOverClipmapMipResolution = 1.0f / ClipmapMipResolution;
2020-09-15 11:03:59 -04:00
PassParameters->ClipmapIndex = ClipmapIndex;
PassParameters->PrevClipmapOffsetZ = PrevClipmapOffsetZ;
PassParameters->ClipmapOffsetZ = NextClipmapOffsetZ;
2022-02-02 07:59:31 -05:00
PassParameters->ClipmapUVScrollOffset = (FVector3f)Clipmap.ScrollOffset / (FVector3f)ClipmapSize;
2020-09-15 11:03:59 -04:00
PassParameters->CoarseDistanceFieldValueScale = 1.0f / GlobalDistanceField::GetMipFactor();
PassParameters->CoarseDistanceFieldValueBias = 0.5f - 0.5f / GlobalDistanceField::GetMipFactor();
FPropagateMipDistanceCS::FPermutationDomain PermutationVector;
PermutationVector.Set<FPropagateMipDistanceCS::FReadPages>(StepIndex == 0);
auto ComputeShader = View.ShaderMap->GetShader<FPropagateMipDistanceCS>(PermutationVector);
FIntVector GroupSize = FComputeShaderUtils::GetGroupCount(FIntVector(ClipmapMipResolution), FPropagateMipDistanceCS::GetGroupSize());
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("Propagate step %d", StepIndex),
ComputeShader,
PassParameters,
GroupSize);
}
}
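The parity test in the propagation loop (`StepIndex % 2 == NumPropagationSteps % 2`) decides which of the two textures is written on each step. The following sketch isolates that rule to show why the last step always writes the real mip texture regardless of the step count. `ETarget` and `WriteTargetForStep` are hypothetical names introduced for illustration only.

```cpp
// The two ping-pong targets used by the propagation loop.
enum class ETarget
{
    Mip,  // stands in for MipTexture, the texture that must hold the final result
    Temp  // stands in for TempMipTexture, the scratch texture
};

// Same rule as the loop: targets start as Prev=Temp / Next=Mip and are swapped
// whenever the step index has the same parity as the total step count.
inline ETarget WriteTargetForStep(int StepIndex, int NumSteps)
{
    const bool bSwapped = (StepIndex % 2) == (NumSteps % 2);
    return bSwapped ? ETarget::Temp : ETarget::Mip;
}
```

Because the final step index `NumSteps - 1` never shares parity with `NumSteps`, it is never swapped, so the last write lands in the mip texture and no extra copy is needed. With the 5 steps used above, the write targets alternate Mip, Temp, Mip, Temp, Mip.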
2015-05-11 20:04:15 -04:00
}
2018-09-11 14:44:10 -04:00
}
2020-07-06 18:58:26 -04:00
}
2018-09-11 14:44:10 -04:00
2022-04-25 13:00:12 -04:00
FRDGExternalAccessQueue ExternalAccessQueue;
2021-04-06 11:45:09 -04:00
2020-07-06 18:58:26 -04:00
for (int32 CacheType = StartCacheType; CacheType < GDF_Num; CacheType++)
{
2020-09-08 17:44:06 -04:00
if (PageTableLayerTextures[CacheType])
2018-09-11 14:44:10 -04:00
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.PageTableLayerTextures[CacheType] = ConvertToExternalAccessTexture(GraphBuilder, ExternalAccessQueue, PageTableLayerTextures[CacheType]);
2015-05-11 20:04:15 -04:00
}
}
2020-09-08 17:44:06 -04:00
if (PageFreeListAllocatorBuffer)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer = ConvertToExternalAccessBuffer(GraphBuilder, ExternalAccessQueue, PageFreeListAllocatorBuffer);
2020-09-08 17:44:06 -04:00
}
if (PageFreeListBuffer)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.PageFreeListBuffer = ConvertToExternalAccessBuffer(GraphBuilder, ExternalAccessQueue, PageFreeListBuffer);
2020-09-08 17:44:06 -04:00
}
if (PageAtlasTexture)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.PageAtlasTexture = ConvertToExternalAccessTexture(GraphBuilder, ExternalAccessQueue, PageAtlasTexture);
2020-09-08 17:44:06 -04:00
}
2022-03-01 21:07:45 -05:00
if (CoverageAtlasTexture)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.CoverageAtlasTexture = ConvertToExternalAccessTexture(GraphBuilder, ExternalAccessQueue, CoverageAtlasTexture);
2022-03-01 21:07:45 -05:00
}
2020-09-08 17:44:06 -04:00
if (PageTableCombinedTexture)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.PageTableCombinedTexture = ConvertToExternalAccessTexture(GraphBuilder, ExternalAccessQueue, PageTableCombinedTexture);
2020-09-08 17:44:06 -04:00
}
2020-09-15 11:03:59 -04:00
if (MipTexture)
{
2022-04-25 13:00:12 -04:00
GlobalDistanceFieldInfo.MipTexture = ConvertToExternalAccessTexture(GraphBuilder, ExternalAccessQueue, MipTexture);
2020-09-15 11:03:59 -04:00
}
2021-04-06 11:45:09 -04:00
2022-04-25 13:00:12 -04:00
ExternalAccessQueue.Submit(GraphBuilder);
2015-05-11 20:04:15 -04:00
}
2022-04-22 19:55:41 -04:00
2022-04-26 14:37:07 -04:00
if (CVarGlobalDistanceFieldDebug.GetValueOnRenderThread() != 0 && GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer)
2022-04-22 19:55:41 -04:00
{
2022-04-26 14:37:07 -04:00
FRDGBufferRef PageFreeListAllocatorBuffer = GraphBuilder.RegisterExternalBuffer(GlobalDistanceFieldInfo.PageFreeListAllocatorBuffer);
2022-04-22 19:55:41 -04:00
FGlobalDistanceFieldDebugCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FGlobalDistanceFieldDebugCS::FParameters>();
ShaderPrint::SetParameters(GraphBuilder, View, PassParameters->ShaderPrintUniformBuffer);
PassParameters->GlobalDistanceFieldPageFreeListAllocatorBuffer = GraphBuilder.CreateSRV(PageFreeListAllocatorBuffer, PF_R32_UINT);
PassParameters->GlobalDistanceFieldMaxPageNum = View.GlobalDistanceFieldInfo.ParameterData.MaxPageNum;
auto ComputeShader = View.ShaderMap->GetShader<FGlobalDistanceFieldDebugCS>();
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("GlobalDistanceFieldDebug"),
ComputeShader,
PassParameters,
FIntVector(1, 1, 1));
}
2020-07-06 18:58:26 -04:00
}
if (GDFReadbackRequest && GlobalDistanceFieldInfo.Clipmaps.Num() > 0)
{
// Read back a clipmap
2020-10-27 13:40:36 -04:00
ReadbackDistanceFieldClipmap(GraphBuilder.RHICmdList, GlobalDistanceFieldInfo);
2015-05-11 20:04:15 -04:00
}
2020-02-06 17:56:50 -05:00
if (GDFReadbackRequest && GlobalDistanceFieldInfo.Clipmaps.Num() > 0)
{
// Read back a clipmap
2020-10-27 13:40:36 -04:00
ReadbackDistanceFieldClipmap(GraphBuilder.RHICmdList, GlobalDistanceFieldInfo);
2020-02-06 17:56:50 -05:00
}
2015-05-11 20:04:15 -04:00
}
2021-06-16 17:48:21 -04:00
void GlobalDistanceField::ExpandDistanceFieldUpdateTrackingBounds(const FSceneViewState* ViewState, DistanceField::FUpdateTrackingBounds& UpdateTrackingBounds)
{
// The Global Distance Field is interested in any updates that fall within the ClipmapInfluenceBounds range of its clipmaps
2022-04-22 19:55:41 -04:00
const int32 NumClipmaps = FMath::Clamp<int32>(GetNumGlobalDistanceFieldClipmaps(false, 1.0f), 0, GlobalDistanceField::MaxClipmaps);
2021-06-16 17:48:21 -04:00
for (int32 ClipmapIndex = 0; ClipmapIndex < NumClipmaps; ClipmapIndex++)
{
const FGlobalDistanceFieldClipmapState& ClipmapViewState = ViewState->GlobalDistanceFieldClipmapState[ClipmapIndex];
const FVector3f ClipmapCenter = ClipmapViewState.CachedClipmapCenter;
const float ClipmapExtent = ClipmapViewState.CachedClipmapExtent + ClipmapViewState.CacheClipmapInfluenceRadius;
const FBox ClipmapInfluenceBounds(ClipmapCenter - ClipmapExtent, ClipmapCenter + ClipmapExtent);
UpdateTrackingBounds.GlobalDistanceFieldBounds += ClipmapInfluenceBounds;
}
2022-03-30 14:37:45 -04:00
}
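The bounds expansion above grows each clipmap's cached extent by its influence radius and accumulates the union of the resulting boxes. A minimal standalone sketch of that accumulation follows. `FVec3`, `FBounds` and `MakeInfluenceBounds` are hypothetical stand-ins for the engine's vector and box types, and the union is written the way an axis-aligned box `+=` is commonly defined, which is an assumption about `FBox` rather than something shown in this file.

```cpp
#include <algorithm>

struct FVec3
{
    float X, Y, Z;
};

// Minimal axis-aligned box with an "is valid" flag so an empty box can absorb the first addition.
struct FBounds
{
    FVec3 Min{ 0.0f, 0.0f, 0.0f };
    FVec3 Max{ 0.0f, 0.0f, 0.0f };
    bool bValid = false;

    // Union with another box (assumed equivalent of FBox::operator+=).
    void Add(const FBounds& Other)
    {
        if (!Other.bValid)
        {
            return;
        }
        if (!bValid)
        {
            *this = Other;
            return;
        }
        Min = { std::min(Min.X, Other.Min.X), std::min(Min.Y, Other.Min.Y), std::min(Min.Z, Other.Min.Z) };
        Max = { std::max(Max.X, Other.Max.X), std::max(Max.Y, Other.Max.Y), std::max(Max.Z, Other.Max.Z) };
    }
};

// Clipmap box grown by its influence radius, as in the loop above.
inline FBounds MakeInfluenceBounds(const FVec3& Center, float Extent, float InfluenceRadius)
{
    const float Grown = Extent + InfluenceRadius;
    FBounds Result;
    Result.Min = { Center.X - Grown, Center.Y - Grown, Center.Z - Grown };
    Result.Max = { Center.X + Grown, Center.Y + Grown, Center.Z + Grown };
    Result.bValid = true;
    return Result;
}
```

Accumulating one such box per clipmap yields a single region covering every update any clipmap could be affected by, which is why the influence radius is added before the union rather than after.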