Files
UnrealEngineUWP/Engine/Source/Runtime/Renderer/Private/DistanceFieldVisualization.cpp

282 lines
12 KiB
C++
Raw Normal View History

Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
// Copyright 1998-2017 Epic Games, Inc. All Rights Reserved.
/*=============================================================================
DistanceFieldVisualization.cpp
=============================================================================*/
#include "DistanceFieldAmbientOcclusion.h"
#include "DeferredShadingRenderer.h"
#include "PostProcess/PostProcessing.h"
#include "PostProcess/SceneFilterRendering.h"
#include "DistanceFieldLightingShared.h"
#include "ScreenRendering.h"
#include "DistanceFieldLightingPost.h"
#include "OneColorShader.h"
#include "GlobalDistanceField.h"
#include "FXSystem.h"
#include "PostProcess/PostProcessSubsurface.h"
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3244756 on 2017/01/03 by Marcus.Wassmer Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering) Change 3248667 on 2017/01/05 by Olaf.Piesche Resaving default asset because of engine verison issue; maybe unnecessary, but resaving niagara engine content to be sure #jira UE-40160 Change 3249324 on 2017/01/06 by Marcus.Wassmer Resave with an actual version to stop cook warning Change 3249611 on 2017/01/06 by Marcus.Wassmer Just remove warning-causing niagara data for now. Change 3308052 on 2017/02/16 by Rolando.Caloca DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute Change 3308109 on 2017/02/16 by Rolando.Caloca DR - Upgrade glslang to 1.0.39.1 Change 3308111 on 2017/02/16 by Rolando.Caloca DR - Update Vulkan distribution to 1.0.39.1 Change 3308153 on 2017/02/16 by Rolando.Caloca DR - Updated glslang libs Change 3308842 on 2017/02/17 by Rolando.Caloca DR - Fixed copy/paste Change 3310007 on 2017/02/17 by Chris.Bunner Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971. #jira UE-37792 Change 3310154 on 2017/02/17 by Chris.Bunner Assert when attempting to add a custom material attribute already in the base attributes list. Change 3310155 on 2017/02/17 by Chris.Bunner PR #3231: Validate material index before accessing (Contributed by projectgheist) #jira UE-41774, UE-41788 Change 3310162 on 2017/02/17 by Chris.Bunner PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist) #jira UE-41823, UE-41950 Change 3310176 on 2017/02/17 by Chris.Bunner Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini). Update to AGS 5.0.5. Partial code tidy up. Change 3310187 on 2017/02/17 by Chris.Bunner Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression. #jira UE-41594 Change 3310215 on 2017/02/17 by Chris.Bunner Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available). More descriptive error for missing Cubemap UV input on TextureSample material node . #jira UE-33098 Change 3310838 on 2017/02/18 by Joe.Graf Moved some private functions to public for a licensee #CodeReview: matt.kuhlenschmidt #rb: n/a Change 3311876 on 2017/02/20 by Rolando.Caloca DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB #jira UE-42014 Change 3314139 on 2017/02/21 by Rolando.Caloca DR - Minor cleanup pass - Remove FVulkanPendingState - Renamed some classes for clarity - Hoist pending UAVs for flush out to pending compute state Change 3314642 on 2017/02/21 by Rolando.Caloca DR - Some more renaming Change 3315431 on 2017/02/21 by Ben.Salem Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time. #tests Ran showdown demo several times Change 3316710 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Fix refract intrinsic Change 3316718 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Built libs to pick up change from 3316710 - refract fix Change 3316820 on 2017/02/22 by Benjamin.Hyder updating Tm-TrigNodes map Change 3317192 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317528 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317639 on 2017/02/22 by Benjamin.Hyder Updating Decal on Complex Mesh example in QA-Decals Change 3317764 on 2017/02/22 by Benjamin.Hyder Final updates to QA-Decals Change 3318319 on 2017/02/22 by Rolando.Caloca DR - minor reorg/rename Change 3318379 on 2017/02/22 by Rolando.Caloca DR - more cleanup Change 3321181 on 2017/02/24 by Rolando.Caloca DR - Fix GL bug Change 3321247 on 2017/02/24 by Rolando.Caloca DR - Fix misc bugs Change 3321898 on 2017/02/24 by Chris.Bunner Only issue clear TLV dispatch if required. #jira UERNDR-193 Change 3321904 on 2017/02/24 by Chris.Bunner Added comment for potential future optimization. Change 3322013 on 2017/02/24 by Uriel.Doyon Fixed separate translucency being affected by Gaussian DOF #jira UE-40489 Change 3322517 on 2017/02/24 by Uriel.Doyon Fixed issue with InvestigateTexture command removing budget limit. Fixed StreamingBounds show flag not working. It nows shows the streaming bound for the currently selected textures. #jira UE-40485 Change 3323470 on 2017/02/27 by Chad.Garyet Removing DDC job from dev-rendering Change 3323479 on 2017/02/27 by Chad.Garyet Removing RDU agent type Change 3323519 on 2017/02/27 by Chad.Garyet removing NCL/LHR/SEA agent types to clean up space Change 3323639 on 2017/02/27 by Benjamin.Hyder More updates to QA-Decals Change 3324207 on 2017/02/27 by Uriel.Doyon Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef Change 3324396 on 2017/02/27 by Uriel.Doyon Fixed an issue with the Streaming Bounds show flag interferring with the static level data initialization #jira UE-40485 Change 3325227 on 2017/02/28 by Chris.Bunner Fix-up AMD AGS libs. Change 3325566 on 2017/02/28 by Uriel.Doyon Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num Change 3326009 on 2017/02/28 by Uriel.Doyon Better fix for 3325566, as the previous fix would ignore the material instance overrides. Change 3327058 on 2017/03/01 by Benjamin.Hyder Preparing TM_Shadermodels map for automation Change 3328222 on 2017/03/01 by Chris.Bunner Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals. #jira UE-42449, UE-42446 Change 3329848 on 2017/03/02 by Uriel.Doyon Added some extra logs to help track UE-42168 Change 3329977 on 2017/03/02 by Rolando.Caloca DR - Fix bad clear value Change 3330008 on 2017/03/02 by Benjamin.Hyder More preparations for QA-Decals automation Change 3330754 on 2017/03/02 by Daniel.Wright Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything Change 3331451 on 2017/03/03 by Marc.Olano Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal Change 3331839 on 2017/03/03 by Rolando.Caloca DR - hlslcc - add missing file to project Change 3332247 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel PR #3305 #jira UE-42393 Change 3332259 on 2017/03/03 by Rolando.Caloca DR - Fix bad index into pixel formats PR #3237 #jira UE-41855 Change 3332305 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers PR #3271 #jira UE-32618 Change 3332313 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel (properly) PR #3305 #jira UE-42393 Change 3332317 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers (properly) PR #3271 #jira UE-32618 Change 3332368 on 2017/03/03 by Rolando.Caloca DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan Change 3333690 on 2017/03/06 by Daniel.Wright [Copy] Changing movable skylight properties no longer affects static draw lists Change 3333693 on 2017/03/06 by Daniel.Wright [Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations Change 3333705 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating piont with a project setting. * 8 bit uses half memory but introduces error for thin surfaces or large meshes. Change 3333721 on 2017/03/06 by David.Hill DecalProxy: Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component). Change 3333772 on 2017/03/06 by Daniel.Wright [Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur. Change 3333790 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance field generation uses Embree, for a 2.5x speedup * Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging Change 3333822 on 2017/03/06 by Daniel.Wright [Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp * Moved FMeshUtilities to its own header so the 8k line MeshUtilites.cpp file can be further split up Change 3333827 on 2017/03/06 by Daniel.Wright [Copy] Range compress 8bit distance fields - gets one extra bit of precision on average Change 3333828 on 2017/03/06 by Daniel.Wright [Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low Change 3333831 on 2017/03/06 by Daniel.Wright Non-editor compile fix Change 3333836 on 2017/03/06 by Daniel.Wright [Copy] Workaround for gobal distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb. Change 3333843 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionExponent to skylight component * Useful for brightening up indoors without losing contact shadows as MinOcclusion does Change 3333845 on 2017/03/06 by Daniel.Wright [Copy] Capsule shadow BP functions Change 3333850 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionCombineMode to skylight component Change 3333854 on 2017/03/06 by Daniel.Wright [Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu Change 3333857 on 2017/03/06 by Daniel.Wright [Copy] Clear light attenuation for local lights with a quad covering their screen extents * Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms. * Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4 Change 3333860 on 2017/03/06 by Daniel.Wright [Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory Change 3333861 on 2017/03/06 by Daniel.Wright [Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas Change 3333869 on 2017/03/06 by Daniel.Wright [Copy] Volumetric Fog using a volume texture mapped to the camera frustum * Volumetric fog can be enabled on an Exponential Height Fog component with additional controls * Lights have a VolumetricScatteringIntensity * New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale * Lighting features supported: * Directional light with CSM and a light function * Point / spot lights without shadows / light functions / IES profiles * Skylight with occlusion from distance fields * Analytical height fog covers the view range past where the volumetric fog ends * Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability * Translucency integrates properly into volumetric fog * Height fog StartDistance is not supported by volumetric fog and should be set to 0. Change 3333894 on 2017/03/06 by Daniel.Wright [Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering Change 3333902 on 2017/03/06 by Daniel.Wright [Copy] Better handling of volumetric fog enabled with distance of 0 Change 3333903 on 2017/03/06 by Daniel.Wright [Copy] Fixed volumetric fog trying to render light functions for a point light Change 3333908 on 2017/03/06 by Daniel.Wright [Copy] Volumetric materials * Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities. * Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius * Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice. * Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely Change 3334134 on 2017/03/06 by Daniel.Wright [Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree. Change 3334420 on 2017/03/06 by Daniel.Wright Fixed RTDF shadows Change 3335467 on 2017/03/07 by Benjamin.Hyder Initial submission of QA-Decals map to EngineTest Change 3335556 on 2017/03/07 by Daniel.Wright Changed mesh distance field default format back to R16f Change 3338020 on 2017/03/08 by Daniel.Wright Disable volumetric fog in vertex shaders for feature levels which don't support it Change 3339394 on 2017/03/09 by Chris.Bunner Correctly handle material texture translation error edge case. #jira UE-42579, UE-42670 Change 3339992 on 2017/03/09 by Daniel.Wright Only compile volumetric fog shaders on supporting platforms Change 3341858 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) #RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite Change 3342004 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) Fix unity build #RB Marcus.Wassmer Change 3343307 on 2017/03/13 by Marcus.Wassmer Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc) Change 3343732 on 2017/03/13 by Rolando.Caloca DR - Vulkan compute pipeline & refactor Change 3344846 on 2017/03/14 by Rolando.Caloca DR - Android compile fixes Change 3344883 on 2017/03/14 by Rolando.Caloca DR - Add missing stencil load/store to PSO initializer Change 3344985 on 2017/03/14 by Rolando.Caloca DR - Made load/store actions uint8 Change 3345141 on 2017/03/14 by Rolando.Caloca DR - vk - Rework render pass hash Change 3345304 on 2017/03/14 by Benjamin.Hyder Updating TM-Distancefields map to include TemplateFloor mesh Change 3345387 on 2017/03/14 by Rolando.Caloca DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating Change 3345388 on 2017/03/14 by Rolando.Caloca DR - Do not stall when creating shaders on Vulkan Change 3345722 on 2017/03/14 by Chris.Bunner PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC) #jira UE-42752 Change 3345723 on 2017/03/14 by Chris.Bunner Reduce log verbosity causing spamming during landscape editing. #jira UE-42714 Change 3345725 on 2017/03/14 by Chris.Bunner [Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes. Change 3345726 on 2017/03/14 by Chris.Bunner Typo fixes. Change 3345732 on 2017/03/14 by Rolando.Caloca DR - Decouple vertex declaration off BSS Change 3345746 on 2017/03/14 by Chris.Bunner Added sign() intrinsic material graph node and delisted material function workaround. Change 3346042 on 2017/03/14 by Chris.Bunner Implement missing size query interface for FRenderTargetResources. #jira UE-41672 Change 3346387 on 2017/03/14 by Daniel.Wright [Copy] Added VolumetricScatteringIntensity to particle lights Change 3346389 on 2017/03/14 by Daniel.Wright [Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs Disable volumetric fog when the fog show flag is disabled Change 3346392 on 2017/03/14 by Daniel.Wright [Copy] Fixed skylight being much too bright on volumetric fog Change 3346406 on 2017/03/14 by Daniel.Wright [Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution. * Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite Change 3346412 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb Change 3346414 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb Change 3346415 on 2017/03/14 by Daniel.Wright [Copy] Missing file from cl 3338451 Change 3346421 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled * Volumetric fog converts NaNs to black now so they don't spread Change 3346422 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaN in volumetric fog with low density values Change 3346423 on 2017/03/14 by Daniel.Wright [Copy] Changed default VolumetricFogScatteringDistribution to .2 Change 3346430 on 2017/03/14 by Daniel.Wright [Copy] New translucent material option to compute fog per pixel instead of the default per vertex Change 3346432 on 2017/03/14 by Daniel.Wright [Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass Fixed lifetimes of temporary Volumetric Fog render targets Change 3346526 on 2017/03/14 by Daniel.Wright [Copy] Volumetric Fog supports point and spot light shadows * These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map) * Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0' * Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing Change 3347053 on 2017/03/15 by Rolando.Caloca DR - android compile fix Change 3347384 on 2017/03/15 by Rolando.Caloca DR - Fix merge issue Change 3347643 on 2017/03/15 by Marcus.Wassmer Fix some bugs with the 'disable stationary skylight ffor the project' feature. Fixes lighting in Persona on Paragon. Change 3347979 on 2017/03/15 by Rolando.Caloca DR - Allow to automatically apply cached rendertargets to PSO initializer Change 3348024 on 2017/03/15 by Rolando.Caloca DR - Remove NullPS on Vulkan to avoid deadlock Change 3348303 on 2017/03/15 by Rolando.Caloca DR - Fix for debugging SCW with material SRT Change 3348357 on 2017/03/15 by Marcus.Wassmer Fix stencildither and a stencilref bug that was probably breaking decals sometimes. Change 3348549 on 2017/03/15 by Marcus.Wassmer Hopefully fix static analysis for potential nullptr access. Change 3348614 on 2017/03/15 by Marcus.Wassmer Duplicate some switch changes to fix crash on launch. Change 3349369 on 2017/03/16 by Gil.Gribb Fixed botched merge Change 3349947 on 2017/03/16 by Rolando.Caloca DR - Fix for mismatched primitive type Change 3349956 on 2017/03/16 by Benjamin.Hyder initial updates to TM-DistanceFields map Change 3350151 on 2017/03/16 by Rolando.Caloca DR - Fix UT compile issue Change 3350155 on 2017/03/16 by Rolando.Caloca DR - Catch mismatched primitive type on PSOs on D3D11 Change 3350192 on 2017/03/16 by Daniel.Wright Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor Change 3350736 on 2017/03/16 by Daniel.Wright Fixed formatting from merge Change 3350881 on 2017/03/16 by Rolando.Caloca DR - Fix texture arrays as UAVs on Metal Change 3350927 on 2017/03/16 by Rolando.Caloca DR - Fix warning Change 3350935 on 2017/03/16 by Daniel.Wright Fix for materials with non-Surface domains being skipped in mesh passes Change 3351583 on 2017/03/17 by Marcus.Wassmer Fix clang platforms Change 3351917 on 2017/03/17 by Marcus.Wassmer Fix linux compile Change 3351973 on 2017/03/17 by Marcus.Wassmer Fix mismatched rendertargetformat Change 3352038 on 2017/03/17 by Daniel.Wright Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing Change 3352110 on 2017/03/17 by Marcus.Wassmer Fix missing RT PSO apply Change 3352695 on 2017/03/17 by Arne.Schober DR - Remove PSO Rendertarget check in DX12 Resolve with Shader. #RB Rolando.Caloca Change 3352960 on 2017/03/17 by Arne.Schober DR - Fix some things that slipped trough the PSO merge #RB none Change 3353150 on 2017/03/18 by Rolando.Caloca DR - compile fix Change 3353205 on 2017/03/18 by Arne.Schober DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode #RB none Change 3353207 on 2017/03/18 by Arne.Schober DR - Fix Confusion #RB none Change 3355183 on 2017/03/20 by Nick.Bullard Fixed up Content orginzation for Decals automation tests in EngineTest Change 3355627 on 2017/03/20 by Arne.Schober DR - [UE-43094] - removed ensure in comporiton graph as control of the clear color cannot be gurantueed. Change 3356342 on 2017/03/21 by Marcus.Wassmer Fix clang errors Change 3356591 on 2017/03/21 by Arne.Schober DR - Fix ensure message #RB none Change 3356873 on 2017/03/21 by Arne.Schober DR - Fix comparission of undefined values in RendertargetApply Check Change 3357261 on 2017/03/21 by Marcus.Wassmer Fix LinuxEditor compile Change 3357294 on 2017/03/21 by Marcus.Wassmer Add missing SSE functions Change 3357351 on 2017/03/21 by Frank.Fella Fix win32 and linux compiler errors Change 3357370 on 2017/03/21 by Arne.Schober DR - disable ensure in test builds #RB Marcus.Wassmer [CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
#include "PipelineStateCache.h"
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
template<bool bUseGlobalDistanceField>
class TVisualizeMeshDistanceFieldCS : public FGlobalShader
{
DECLARE_SHADER_TYPE(TVisualizeMeshDistanceFieldCS, Global);
public:
static bool ShouldCache(EShaderPlatform Platform)
{
return IsFeatureLevelSupported(Platform, ERHIFeatureLevel::SM5) && DoesPlatformSupportDistanceFieldAO(Platform);
}
static void ModifyCompilationEnvironment(EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment)
{
OutEnvironment.SetDefine(TEXT("DOWNSAMPLE_FACTOR"), GAODownsampleFactor);
OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZEX"), GDistanceFieldAOTileSizeX);
OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZEY"), GDistanceFieldAOTileSizeY);
OutEnvironment.SetDefine(TEXT("USE_GLOBAL_DISTANCE_FIELD"), bUseGlobalDistanceField);
}
/** Default constructor. */
TVisualizeMeshDistanceFieldCS() {}
/** Initialization constructor. */
TVisualizeMeshDistanceFieldCS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
: FGlobalShader(Initializer)
{
VisualizeMeshDistanceFields.Bind(Initializer.ParameterMap, TEXT("VisualizeMeshDistanceFields"));
NumGroups.Bind(Initializer.ParameterMap, TEXT("NumGroups"));
ObjectParameters.Bind(Initializer.ParameterMap);
DeferredParameters.Bind(Initializer.ParameterMap);
AOParameters.Bind(Initializer.ParameterMap);
GlobalDistanceFieldParameters.Bind(Initializer.ParameterMap);
}
void SetParameters(
FRHICommandList& RHICmdList,
const FSceneView& View,
FSceneRenderTargetItem& VisualizeMeshDistanceFieldsValue,
FVector2D NumGroupsValue,
const FDistanceFieldAOParameters& Parameters,
const FGlobalDistanceFieldInfo& GlobalDistanceFieldInfo)
{
const FComputeShaderRHIParamRef ShaderRHI = GetComputeShader();
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3244756 on 2017/01/03 by Marcus.Wassmer Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering) Change 3248667 on 2017/01/05 by Olaf.Piesche Resaving default asset because of engine verison issue; maybe unnecessary, but resaving niagara engine content to be sure #jira UE-40160 Change 3249324 on 2017/01/06 by Marcus.Wassmer Resave with an actual version to stop cook warning Change 3249611 on 2017/01/06 by Marcus.Wassmer Just remove warning-causing niagara data for now. Change 3308052 on 2017/02/16 by Rolando.Caloca DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute Change 3308109 on 2017/02/16 by Rolando.Caloca DR - Upgrade glslang to 1.0.39.1 Change 3308111 on 2017/02/16 by Rolando.Caloca DR - Update Vulkan distribution to 1.0.39.1 Change 3308153 on 2017/02/16 by Rolando.Caloca DR - Updated glslang libs Change 3308842 on 2017/02/17 by Rolando.Caloca DR - Fixed copy/paste Change 3310007 on 2017/02/17 by Chris.Bunner Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971. #jira UE-37792 Change 3310154 on 2017/02/17 by Chris.Bunner Assert when attempting to add a custom material attribute already in the base attributes list. Change 3310155 on 2017/02/17 by Chris.Bunner PR #3231: Validate material index before accessing (Contributed by projectgheist) #jira UE-41774, UE-41788 Change 3310162 on 2017/02/17 by Chris.Bunner PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist) #jira UE-41823, UE-41950 Change 3310176 on 2017/02/17 by Chris.Bunner Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini). Update to AGS 5.0.5. Partial code tidy up. Change 3310187 on 2017/02/17 by Chris.Bunner Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression. #jira UE-41594 Change 3310215 on 2017/02/17 by Chris.Bunner Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available). More descriptive error for missing Cubemap UV input on TextureSample material node . #jira UE-33098 Change 3310838 on 2017/02/18 by Joe.Graf Moved some private functions to public for a licensee #CodeReview: matt.kuhlenschmidt #rb: n/a Change 3311876 on 2017/02/20 by Rolando.Caloca DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB #jira UE-42014 Change 3314139 on 2017/02/21 by Rolando.Caloca DR - Minor cleanup pass - Remove FVulkanPendingState - Renamed some classes for clarity - Hoist pending UAVs for flush out to pending compute state Change 3314642 on 2017/02/21 by Rolando.Caloca DR - Some more renaming Change 3315431 on 2017/02/21 by Ben.Salem Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time. #tests Ran showdown demo several times Change 3316710 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Fix refract intrinsic Change 3316718 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Built libs to pick up change from 3316710 - refract fix Change 3316820 on 2017/02/22 by Benjamin.Hyder updating Tm-TrigNodes map Change 3317192 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317528 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317639 on 2017/02/22 by Benjamin.Hyder Updating Decal on Complex Mesh example in QA-Decals Change 3317764 on 2017/02/22 by Benjamin.Hyder Final updates to QA-Decals Change 3318319 on 2017/02/22 by Rolando.Caloca DR - minor reorg/rename Change 3318379 on 2017/02/22 by Rolando.Caloca DR - more cleanup Change 3321181 on 2017/02/24 by Rolando.Caloca DR - Fix GL bug Change 3321247 on 2017/02/24 by Rolando.Caloca DR - Fix misc bugs Change 3321898 on 2017/02/24 by Chris.Bunner Only issue clear TLV dispatch if required. #jira UERNDR-193 Change 3321904 on 2017/02/24 by Chris.Bunner Added comment for potential future optimization. Change 3322013 on 2017/02/24 by Uriel.Doyon Fixed separate translucency being affected by Gaussian DOF #jira UE-40489 Change 3322517 on 2017/02/24 by Uriel.Doyon Fixed issue with InvestigateTexture command removing budget limit. Fixed StreamingBounds show flag not working. It nows shows the streaming bound for the currently selected textures. #jira UE-40485 Change 3323470 on 2017/02/27 by Chad.Garyet Removing DDC job from dev-rendering Change 3323479 on 2017/02/27 by Chad.Garyet Removing RDU agent type Change 3323519 on 2017/02/27 by Chad.Garyet removing NCL/LHR/SEA agent types to clean up space Change 3323639 on 2017/02/27 by Benjamin.Hyder More updates to QA-Decals Change 3324207 on 2017/02/27 by Uriel.Doyon Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef Change 3324396 on 2017/02/27 by Uriel.Doyon Fixed an issue with the Streaming Bounds show flag interferring with the static level data initialization #jira UE-40485 Change 3325227 on 2017/02/28 by Chris.Bunner Fix-up AMD AGS libs. Change 3325566 on 2017/02/28 by Uriel.Doyon Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num Change 3326009 on 2017/02/28 by Uriel.Doyon Better fix for 3325566, as the previous fix would ignore the material instance overrides. Change 3327058 on 2017/03/01 by Benjamin.Hyder Preparing TM_Shadermodels map for automation Change 3328222 on 2017/03/01 by Chris.Bunner Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals. #jira UE-42449, UE-42446 Change 3329848 on 2017/03/02 by Uriel.Doyon Added some extra logs to help track UE-42168 Change 3329977 on 2017/03/02 by Rolando.Caloca DR - Fix bad clear value Change 3330008 on 2017/03/02 by Benjamin.Hyder More preparations for QA-Decals automation Change 3330754 on 2017/03/02 by Daniel.Wright Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything Change 3331451 on 2017/03/03 by Marc.Olano Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal Change 3331839 on 2017/03/03 by Rolando.Caloca DR - hlslcc - add missing file to project Change 3332247 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel PR #3305 #jira UE-42393 Change 3332259 on 2017/03/03 by Rolando.Caloca DR - Fix bad index into pixel formats PR #3237 #jira UE-41855 Change 3332305 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers PR #3271 #jira UE-32618 Change 3332313 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel (properly) PR #3305 #jira UE-42393 Change 3332317 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers (properly) PR #3271 #jira UE-32618 Change 3332368 on 2017/03/03 by Rolando.Caloca DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan Change 3333690 on 2017/03/06 by Daniel.Wright [Copy] Changing movable skylight properties no longer affects static draw lists Change 3333693 on 2017/03/06 by Daniel.Wright [Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations Change 3333705 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating piont with a project setting. * 8 bit uses half memory but introduces error for thin surfaces or large meshes. Change 3333721 on 2017/03/06 by David.Hill DecalProxy: Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component). Change 3333772 on 2017/03/06 by Daniel.Wright [Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur. Change 3333790 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance field generation uses Embree, for a 2.5x speedup * Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging Change 3333822 on 2017/03/06 by Daniel.Wright [Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp * Moved FMeshUtilities to its own header so the 8k line MeshUtilites.cpp file can be further split up Change 3333827 on 2017/03/06 by Daniel.Wright [Copy] Range compress 8bit distance fields - gets one extra bit of precision on average Change 3333828 on 2017/03/06 by Daniel.Wright [Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low Change 3333831 on 2017/03/06 by Daniel.Wright Non-editor compile fix Change 3333836 on 2017/03/06 by Daniel.Wright [Copy] Workaround for gobal distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb. Change 3333843 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionExponent to skylight component * Useful for brightening up indoors without losing contact shadows as MinOcclusion does Change 3333845 on 2017/03/06 by Daniel.Wright [Copy] Capsule shadow BP functions Change 3333850 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionCombineMode to skylight component Change 3333854 on 2017/03/06 by Daniel.Wright [Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu Change 3333857 on 2017/03/06 by Daniel.Wright [Copy] Clear light attenuation for local lights with a quad covering their screen extents * Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms. * Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4 Change 3333860 on 2017/03/06 by Daniel.Wright [Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory Change 3333861 on 2017/03/06 by Daniel.Wright [Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas Change 3333869 on 2017/03/06 by Daniel.Wright [Copy] Volumetric Fog using a volume texture mapped to the camera frustum * Volumetric fog can be enabled on an Exponential Height Fog component with additional controls * Lights have a VolumetricScatteringIntensity * New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale * Lighting features supported: * Directional light with CSM and a light function * Point / spot lights without shadows / light functions / IES profiles * Skylight with occlusion from distance fields * Analytical height fog covers the view range past where the volumetric fog ends * Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability * Translucency integrates properly into volumetric fog * Height fog StartDistance is not supported by volumetric fog and should be set to 0. Change 3333894 on 2017/03/06 by Daniel.Wright [Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering Change 3333902 on 2017/03/06 by Daniel.Wright [Copy] Better handling of volumetric fog enabled with distance of 0 Change 3333903 on 2017/03/06 by Daniel.Wright [Copy] Fixed volumetric fog trying to render light functions for a point light Change 3333908 on 2017/03/06 by Daniel.Wright [Copy] Volumetric materials * Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities. * Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius * Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice. * Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely Change 3334134 on 2017/03/06 by Daniel.Wright [Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree. Change 3334420 on 2017/03/06 by Daniel.Wright Fixed RTDF shadows Change 3335467 on 2017/03/07 by Benjamin.Hyder Initial submission of QA-Decals map to EngineTest Change 3335556 on 2017/03/07 by Daniel.Wright Changed mesh distance field default format back to R16f Change 3338020 on 2017/03/08 by Daniel.Wright Disable volumetric fog in vertex shaders for feature levels which don't support it Change 3339394 on 2017/03/09 by Chris.Bunner Correctly handle material texture translation error edge case. #jira UE-42579, UE-42670 Change 3339992 on 2017/03/09 by Daniel.Wright Only compile volumetric fog shaders on supporting platforms Change 3341858 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) #RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite Change 3342004 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) Fix unity build #RB Marcus.Wassmer Change 3343307 on 2017/03/13 by Marcus.Wassmer Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc) Change 3343732 on 2017/03/13 by Rolando.Caloca DR - Vulkan compute pipeline & refactor Change 3344846 on 2017/03/14 by Rolando.Caloca DR - Android compile fixes Change 3344883 on 2017/03/14 by Rolando.Caloca DR - Add missing stencil load/store to PSO initializer Change 3344985 on 2017/03/14 by Rolando.Caloca DR - Made load/store actions uint8 Change 3345141 on 2017/03/14 by Rolando.Caloca DR - vk - Rework render pass hash Change 3345304 on 2017/03/14 by Benjamin.Hyder Updating TM-Distancefields map to include TemplateFloor mesh Change 3345387 on 2017/03/14 by Rolando.Caloca DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating Change 3345388 on 2017/03/14 by Rolando.Caloca DR - Do not stall when creating shaders on Vulkan Change 3345722 on 2017/03/14 by Chris.Bunner PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC) #jira UE-42752 Change 3345723 on 2017/03/14 by Chris.Bunner Reduce log verbosity causing spamming during landscape editing. #jira UE-42714 Change 3345725 on 2017/03/14 by Chris.Bunner [Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes. Change 3345726 on 2017/03/14 by Chris.Bunner Typo fixes. Change 3345732 on 2017/03/14 by Rolando.Caloca DR - Decouple vertex declaration off BSS Change 3345746 on 2017/03/14 by Chris.Bunner Added sign() intrinsic material graph node and delisted material function workaround. Change 3346042 on 2017/03/14 by Chris.Bunner Implement missing size query interface for FRenderTargetResources. #jira UE-41672 Change 3346387 on 2017/03/14 by Daniel.Wright [Copy] Added VolumetricScatteringIntensity to particle lights Change 3346389 on 2017/03/14 by Daniel.Wright [Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs Disable volumetric fog when the fog show flag is disabled Change 3346392 on 2017/03/14 by Daniel.Wright [Copy] Fixed skylight being much too bright on volumetric fog Change 3346406 on 2017/03/14 by Daniel.Wright [Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution. * Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite Change 3346412 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb Change 3346414 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb Change 3346415 on 2017/03/14 by Daniel.Wright [Copy] Missing file from cl 3338451 Change 3346421 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled * Volumetric fog converts NaNs to black now so they don't spread Change 3346422 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaN in volumetric fog with low density values Change 3346423 on 2017/03/14 by Daniel.Wright [Copy] Changed default VolumetricFogScatteringDistribution to .2 Change 3346430 on 2017/03/14 by Daniel.Wright [Copy] New translucent material option to compute fog per pixel instead of the default per vertex Change 3346432 on 2017/03/14 by Daniel.Wright [Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass Fixed lifetimes of temporary Volumetric Fog render targets Change 3346526 on 2017/03/14 by Daniel.Wright [Copy] Volumetric Fog supports point and spot light shadows * These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map) * Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0' * Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing Change 3347053 on 2017/03/15 by Rolando.Caloca DR - android compile fix Change 3347384 on 2017/03/15 by Rolando.Caloca DR - Fix merge issue Change 3347643 on 2017/03/15 by Marcus.Wassmer Fix some bugs with the 'disable stationary skylight ffor the project' feature. Fixes lighting in Persona on Paragon. Change 3347979 on 2017/03/15 by Rolando.Caloca DR - Allow to automatically apply cached rendertargets to PSO initializer Change 3348024 on 2017/03/15 by Rolando.Caloca DR - Remove NullPS on Vulkan to avoid deadlock Change 3348303 on 2017/03/15 by Rolando.Caloca DR - Fix for debugging SCW with material SRT Change 3348357 on 2017/03/15 by Marcus.Wassmer Fix stencildither and a stencilref bug that was probably breaking decals sometimes. Change 3348549 on 2017/03/15 by Marcus.Wassmer Hopefully fix static analysis for potential nullptr access. Change 3348614 on 2017/03/15 by Marcus.Wassmer Duplicate some switch changes to fix crash on launch. Change 3349369 on 2017/03/16 by Gil.Gribb Fixed botched merge Change 3349947 on 2017/03/16 by Rolando.Caloca DR - Fix for mismatched primitive type Change 3349956 on 2017/03/16 by Benjamin.Hyder initial updates to TM-DistanceFields map Change 3350151 on 2017/03/16 by Rolando.Caloca DR - Fix UT compile issue Change 3350155 on 2017/03/16 by Rolando.Caloca DR - Catch mismatched primitive type on PSOs on D3D11 Change 3350192 on 2017/03/16 by Daniel.Wright Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor Change 3350736 on 2017/03/16 by Daniel.Wright Fixed formatting from merge Change 3350881 on 2017/03/16 by Rolando.Caloca DR - Fix texture arrays as UAVs on Metal Change 3350927 on 2017/03/16 by Rolando.Caloca DR - Fix warning Change 3350935 on 2017/03/16 by Daniel.Wright Fix for materials with non-Surface domains being skipped in mesh passes Change 3351583 on 2017/03/17 by Marcus.Wassmer Fix clang platforms Change 3351917 on 2017/03/17 by Marcus.Wassmer Fix linux compile Change 3351973 on 2017/03/17 by Marcus.Wassmer Fix mismatched rendertargetformat Change 3352038 on 2017/03/17 by Daniel.Wright Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing Change 3352110 on 2017/03/17 by Marcus.Wassmer Fix missing RT PSO apply Change 3352695 on 2017/03/17 by Arne.Schober DR - Remove PSO Rendertarget check in DX12 Resolve with Shader. #RB Rolando.Caloca Change 3352960 on 2017/03/17 by Arne.Schober DR - Fix some things that slipped trough the PSO merge #RB none Change 3353150 on 2017/03/18 by Rolando.Caloca DR - compile fix Change 3353205 on 2017/03/18 by Arne.Schober DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode #RB none Change 3353207 on 2017/03/18 by Arne.Schober DR - Fix Confusion #RB none Change 3355183 on 2017/03/20 by Nick.Bullard Fixed up Content orginzation for Decals automation tests in EngineTest Change 3355627 on 2017/03/20 by Arne.Schober DR - [UE-43094] - removed ensure in comporiton graph as control of the clear color cannot be gurantueed. Change 3356342 on 2017/03/21 by Marcus.Wassmer Fix clang errors Change 3356591 on 2017/03/21 by Arne.Schober DR - Fix ensure message #RB none Change 3356873 on 2017/03/21 by Arne.Schober DR - Fix comparission of undefined values in RendertargetApply Check Change 3357261 on 2017/03/21 by Marcus.Wassmer Fix LinuxEditor compile Change 3357294 on 2017/03/21 by Marcus.Wassmer Add missing SSE functions Change 3357351 on 2017/03/21 by Frank.Fella Fix win32 and linux compiler errors Change 3357370 on 2017/03/21 by Arne.Schober DR - disable ensure in test builds #RB Marcus.Wassmer [CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
FGlobalShader::SetParameters<FViewUniformShaderParameters>(RHICmdList, ShaderRHI, View.ViewUniformBuffer);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
RHICmdList.TransitionResource(EResourceTransitionAccess::ERWBarrier, EResourceTransitionPipeline::EComputeToCompute, VisualizeMeshDistanceFieldsValue.UAV);
VisualizeMeshDistanceFields.SetTexture(RHICmdList, ShaderRHI, VisualizeMeshDistanceFieldsValue.ShaderResourceTexture, VisualizeMeshDistanceFieldsValue.UAV);
ObjectParameters.Set(RHICmdList, ShaderRHI, GAOCulledObjectBuffers.Buffers);
AOParameters.Set(RHICmdList, ShaderRHI, Parameters);
DeferredParameters.Set(RHICmdList, ShaderRHI, View);
if (bUseGlobalDistanceField)
{
GlobalDistanceFieldParameters.Set(RHICmdList, ShaderRHI, GlobalDistanceFieldInfo.ParameterData);
}
SetShaderValue(RHICmdList, ShaderRHI, NumGroups, NumGroupsValue);
}
void UnsetParameters(FRHICommandList& RHICmdList, FSceneRenderTargetItem& VisualizeMeshDistanceFieldsValue)
{
RHICmdList.TransitionResource(EResourceTransitionAccess::EReadable, EResourceTransitionPipeline::EComputeToCompute, VisualizeMeshDistanceFieldsValue.UAV);
VisualizeMeshDistanceFields.UnsetUAV(RHICmdList, GetComputeShader());
}
// FShader interface.
virtual bool Serialize(FArchive& Ar) override
{
bool bShaderHasOutdatedParameters = FGlobalShader::Serialize(Ar);
Ar << VisualizeMeshDistanceFields;
Ar << NumGroups;
Ar << ObjectParameters;
Ar << DeferredParameters;
Ar << AOParameters;
Ar << GlobalDistanceFieldParameters;
return bShaderHasOutdatedParameters;
}
private:
FRWShaderParameter VisualizeMeshDistanceFields;
FShaderParameter NumGroups;
FDistanceFieldCulledObjectBufferParameters ObjectParameters;
FDeferredPixelShaderParameters DeferredParameters;
FAOParameters AOParameters;
FGlobalDistanceFieldParameters GlobalDistanceFieldParameters;
};
IMPLEMENT_SHADER_TYPE(template<>,TVisualizeMeshDistanceFieldCS<true>,TEXT("DistanceFieldVisualization"),TEXT("VisualizeMeshDistanceFieldCS"),SF_Compute);
IMPLEMENT_SHADER_TYPE(template<>,TVisualizeMeshDistanceFieldCS<false>,TEXT("DistanceFieldVisualization"),TEXT("VisualizeMeshDistanceFieldCS"),SF_Compute);
class FVisualizeDistanceFieldUpsamplePS : public FGlobalShader
{
DECLARE_SHADER_TYPE(FVisualizeDistanceFieldUpsamplePS, Global);
public:
static bool ShouldCache(EShaderPlatform Platform)
{
return IsFeatureLevelSupported(Platform, ERHIFeatureLevel::SM5) && DoesPlatformSupportDistanceFieldAO(Platform);
}
static void ModifyCompilationEnvironment(EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment)
{
OutEnvironment.SetDefine(TEXT("DOWNSAMPLE_FACTOR"), GAODownsampleFactor);
}
/** Default constructor. */
FVisualizeDistanceFieldUpsamplePS() {}
/** Initialization constructor. */
FVisualizeDistanceFieldUpsamplePS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
: FGlobalShader(Initializer)
{
DeferredParameters.Bind(Initializer.ParameterMap);
VisualizeDistanceFieldTexture.Bind(Initializer.ParameterMap,TEXT("VisualizeDistanceFieldTexture"));
VisualizeDistanceFieldSampler.Bind(Initializer.ParameterMap,TEXT("VisualizeDistanceFieldSampler"));
}
void SetParameters(FRHICommandList& RHICmdList, const FSceneView& View, TRefCountPtr<IPooledRenderTarget>& VisualizeDistanceField)
{
const FPixelShaderRHIParamRef ShaderRHI = GetPixelShader();
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3244756 on 2017/01/03 by Marcus.Wassmer Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering) Change 3248667 on 2017/01/05 by Olaf.Piesche Resaving default asset because of engine verison issue; maybe unnecessary, but resaving niagara engine content to be sure #jira UE-40160 Change 3249324 on 2017/01/06 by Marcus.Wassmer Resave with an actual version to stop cook warning Change 3249611 on 2017/01/06 by Marcus.Wassmer Just remove warning-causing niagara data for now. Change 3308052 on 2017/02/16 by Rolando.Caloca DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute Change 3308109 on 2017/02/16 by Rolando.Caloca DR - Upgrade glslang to 1.0.39.1 Change 3308111 on 2017/02/16 by Rolando.Caloca DR - Update Vulkan distribution to 1.0.39.1 Change 3308153 on 2017/02/16 by Rolando.Caloca DR - Updated glslang libs Change 3308842 on 2017/02/17 by Rolando.Caloca DR - Fixed copy/paste Change 3310007 on 2017/02/17 by Chris.Bunner Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971. #jira UE-37792 Change 3310154 on 2017/02/17 by Chris.Bunner Assert when attempting to add a custom material attribute already in the base attributes list. Change 3310155 on 2017/02/17 by Chris.Bunner PR #3231: Validate material index before accessing (Contributed by projectgheist) #jira UE-41774, UE-41788 Change 3310162 on 2017/02/17 by Chris.Bunner PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist) #jira UE-41823, UE-41950 Change 3310176 on 2017/02/17 by Chris.Bunner Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini). Update to AGS 5.0.5. Partial code tidy up. Change 3310187 on 2017/02/17 by Chris.Bunner Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression. #jira UE-41594 Change 3310215 on 2017/02/17 by Chris.Bunner Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available). More descriptive error for missing Cubemap UV input on TextureSample material node . #jira UE-33098 Change 3310838 on 2017/02/18 by Joe.Graf Moved some private functions to public for a licensee #CodeReview: matt.kuhlenschmidt #rb: n/a Change 3311876 on 2017/02/20 by Rolando.Caloca DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB #jira UE-42014 Change 3314139 on 2017/02/21 by Rolando.Caloca DR - Minor cleanup pass - Remove FVulkanPendingState - Renamed some classes for clarity - Hoist pending UAVs for flush out to pending compute state Change 3314642 on 2017/02/21 by Rolando.Caloca DR - Some more renaming Change 3315431 on 2017/02/21 by Ben.Salem Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time. #tests Ran showdown demo several times Change 3316710 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Fix refract intrinsic Change 3316718 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Built libs to pick up change from 3316710 - refract fix Change 3316820 on 2017/02/22 by Benjamin.Hyder updating Tm-TrigNodes map Change 3317192 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317528 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317639 on 2017/02/22 by Benjamin.Hyder Updating Decal on Complex Mesh example in QA-Decals Change 3317764 on 2017/02/22 by Benjamin.Hyder Final updates to QA-Decals Change 3318319 on 2017/02/22 by Rolando.Caloca DR - minor reorg/rename Change 3318379 on 2017/02/22 by Rolando.Caloca DR - more cleanup Change 3321181 on 2017/02/24 by Rolando.Caloca DR - Fix GL bug Change 3321247 on 2017/02/24 by Rolando.Caloca DR - Fix misc bugs Change 3321898 on 2017/02/24 by Chris.Bunner Only issue clear TLV dispatch if required. #jira UERNDR-193 Change 3321904 on 2017/02/24 by Chris.Bunner Added comment for potential future optimization. Change 3322013 on 2017/02/24 by Uriel.Doyon Fixed separate translucency being affected by Gaussian DOF #jira UE-40489 Change 3322517 on 2017/02/24 by Uriel.Doyon Fixed issue with InvestigateTexture command removing budget limit. Fixed StreamingBounds show flag not working. It nows shows the streaming bound for the currently selected textures. #jira UE-40485 Change 3323470 on 2017/02/27 by Chad.Garyet Removing DDC job from dev-rendering Change 3323479 on 2017/02/27 by Chad.Garyet Removing RDU agent type Change 3323519 on 2017/02/27 by Chad.Garyet removing NCL/LHR/SEA agent types to clean up space Change 3323639 on 2017/02/27 by Benjamin.Hyder More updates to QA-Decals Change 3324207 on 2017/02/27 by Uriel.Doyon Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef Change 3324396 on 2017/02/27 by Uriel.Doyon Fixed an issue with the Streaming Bounds show flag interferring with the static level data initialization #jira UE-40485 Change 3325227 on 2017/02/28 by Chris.Bunner Fix-up AMD AGS libs. Change 3325566 on 2017/02/28 by Uriel.Doyon Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num Change 3326009 on 2017/02/28 by Uriel.Doyon Better fix for 3325566, as the previous fix would ignore the material instance overrides. Change 3327058 on 2017/03/01 by Benjamin.Hyder Preparing TM_Shadermodels map for automation Change 3328222 on 2017/03/01 by Chris.Bunner Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals. #jira UE-42449, UE-42446 Change 3329848 on 2017/03/02 by Uriel.Doyon Added some extra logs to help track UE-42168 Change 3329977 on 2017/03/02 by Rolando.Caloca DR - Fix bad clear value Change 3330008 on 2017/03/02 by Benjamin.Hyder More preparations for QA-Decals automation Change 3330754 on 2017/03/02 by Daniel.Wright Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything Change 3331451 on 2017/03/03 by Marc.Olano Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal Change 3331839 on 2017/03/03 by Rolando.Caloca DR - hlslcc - add missing file to project Change 3332247 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel PR #3305 #jira UE-42393 Change 3332259 on 2017/03/03 by Rolando.Caloca DR - Fix bad index into pixel formats PR #3237 #jira UE-41855 Change 3332305 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers PR #3271 #jira UE-32618 Change 3332313 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel (properly) PR #3305 #jira UE-42393 Change 3332317 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers (properly) PR #3271 #jira UE-32618 Change 3332368 on 2017/03/03 by Rolando.Caloca DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan Change 3333690 on 2017/03/06 by Daniel.Wright [Copy] Changing movable skylight properties no longer affects static draw lists Change 3333693 on 2017/03/06 by Daniel.Wright [Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations Change 3333705 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating piont with a project setting. * 8 bit uses half memory but introduces error for thin surfaces or large meshes. Change 3333721 on 2017/03/06 by David.Hill DecalProxy: Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component). Change 3333772 on 2017/03/06 by Daniel.Wright [Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur. Change 3333790 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance field generation uses Embree, for a 2.5x speedup * Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging Change 3333822 on 2017/03/06 by Daniel.Wright [Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp * Moved FMeshUtilities to its own header so the 8k line MeshUtilites.cpp file can be further split up Change 3333827 on 2017/03/06 by Daniel.Wright [Copy] Range compress 8bit distance fields - gets one extra bit of precision on average Change 3333828 on 2017/03/06 by Daniel.Wright [Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low Change 3333831 on 2017/03/06 by Daniel.Wright Non-editor compile fix Change 3333836 on 2017/03/06 by Daniel.Wright [Copy] Workaround for gobal distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb. Change 3333843 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionExponent to skylight component * Useful for brightening up indoors without losing contact shadows as MinOcclusion does Change 3333845 on 2017/03/06 by Daniel.Wright [Copy] Capsule shadow BP functions Change 3333850 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionCombineMode to skylight component Change 3333854 on 2017/03/06 by Daniel.Wright [Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu Change 3333857 on 2017/03/06 by Daniel.Wright [Copy] Clear light attenuation for local lights with a quad covering their screen extents * Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms. * Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4 Change 3333860 on 2017/03/06 by Daniel.Wright [Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory Change 3333861 on 2017/03/06 by Daniel.Wright [Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas Change 3333869 on 2017/03/06 by Daniel.Wright [Copy] Volumetric Fog using a volume texture mapped to the camera frustum * Volumetric fog can be enabled on an Exponential Height Fog component with additional controls * Lights have a VolumetricScatteringIntensity * New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale * Lighting features supported: * Directional light with CSM and a light function * Point / spot lights without shadows / light functions / IES profiles * Skylight with occlusion from distance fields * Analytical height fog covers the view range past where the volumetric fog ends * Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability * Translucency integrates properly into volumetric fog * Height fog StartDistance is not supported by volumetric fog and should be set to 0. Change 3333894 on 2017/03/06 by Daniel.Wright [Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering Change 3333902 on 2017/03/06 by Daniel.Wright [Copy] Better handling of volumetric fog enabled with distance of 0 Change 3333903 on 2017/03/06 by Daniel.Wright [Copy] Fixed volumetric fog trying to render light functions for a point light Change 3333908 on 2017/03/06 by Daniel.Wright [Copy] Volumetric materials * Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities. * Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius * Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice. * Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely Change 3334134 on 2017/03/06 by Daniel.Wright [Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree. Change 3334420 on 2017/03/06 by Daniel.Wright Fixed RTDF shadows Change 3335467 on 2017/03/07 by Benjamin.Hyder Initial submission of QA-Decals map to EngineTest Change 3335556 on 2017/03/07 by Daniel.Wright Changed mesh distance field default format back to R16f Change 3338020 on 2017/03/08 by Daniel.Wright Disable volumetric fog in vertex shaders for feature levels which don't support it Change 3339394 on 2017/03/09 by Chris.Bunner Correctly handle material texture translation error edge case. #jira UE-42579, UE-42670 Change 3339992 on 2017/03/09 by Daniel.Wright Only compile volumetric fog shaders on supporting platforms Change 3341858 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) #RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite Change 3342004 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) Fix unity build #RB Marcus.Wassmer Change 3343307 on 2017/03/13 by Marcus.Wassmer Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc) Change 3343732 on 2017/03/13 by Rolando.Caloca DR - Vulkan compute pipeline & refactor Change 3344846 on 2017/03/14 by Rolando.Caloca DR - Android compile fixes Change 3344883 on 2017/03/14 by Rolando.Caloca DR - Add missing stencil load/store to PSO initializer Change 3344985 on 2017/03/14 by Rolando.Caloca DR - Made load/store actions uint8 Change 3345141 on 2017/03/14 by Rolando.Caloca DR - vk - Rework render pass hash Change 3345304 on 2017/03/14 by Benjamin.Hyder Updating TM-Distancefields map to include TemplateFloor mesh Change 3345387 on 2017/03/14 by Rolando.Caloca DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating Change 3345388 on 2017/03/14 by Rolando.Caloca DR - Do not stall when creating shaders on Vulkan Change 3345722 on 2017/03/14 by Chris.Bunner PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC) #jira UE-42752 Change 3345723 on 2017/03/14 by Chris.Bunner Reduce log verbosity causing spamming during landscape editing. #jira UE-42714 Change 3345725 on 2017/03/14 by Chris.Bunner [Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes. Change 3345726 on 2017/03/14 by Chris.Bunner Typo fixes. Change 3345732 on 2017/03/14 by Rolando.Caloca DR - Decouple vertex declaration off BSS Change 3345746 on 2017/03/14 by Chris.Bunner Added sign() intrinsic material graph node and delisted material function workaround. Change 3346042 on 2017/03/14 by Chris.Bunner Implement missing size query interface for FRenderTargetResources. #jira UE-41672 Change 3346387 on 2017/03/14 by Daniel.Wright [Copy] Added VolumetricScatteringIntensity to particle lights Change 3346389 on 2017/03/14 by Daniel.Wright [Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs Disable volumetric fog when the fog show flag is disabled Change 3346392 on 2017/03/14 by Daniel.Wright [Copy] Fixed skylight being much too bright on volumetric fog Change 3346406 on 2017/03/14 by Daniel.Wright [Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution. * Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite Change 3346412 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb Change 3346414 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb Change 3346415 on 2017/03/14 by Daniel.Wright [Copy] Missing file from cl 3338451 Change 3346421 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled * Volumetric fog converts NaNs to black now so they don't spread Change 3346422 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaN in volumetric fog with low density values Change 3346423 on 2017/03/14 by Daniel.Wright [Copy] Changed default VolumetricFogScatteringDistribution to .2 Change 3346430 on 2017/03/14 by Daniel.Wright [Copy] New translucent material option to compute fog per pixel instead of the default per vertex Change 3346432 on 2017/03/14 by Daniel.Wright [Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass Fixed lifetimes of temporary Volumetric Fog render targets Change 3346526 on 2017/03/14 by Daniel.Wright [Copy] Volumetric Fog supports point and spot light shadows * These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map) * Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0' * Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing Change 3347053 on 2017/03/15 by Rolando.Caloca DR - android compile fix Change 3347384 on 2017/03/15 by Rolando.Caloca DR - Fix merge issue Change 3347643 on 2017/03/15 by Marcus.Wassmer Fix some bugs with the 'disable stationary skylight ffor the project' feature. Fixes lighting in Persona on Paragon. Change 3347979 on 2017/03/15 by Rolando.Caloca DR - Allow to automatically apply cached rendertargets to PSO initializer Change 3348024 on 2017/03/15 by Rolando.Caloca DR - Remove NullPS on Vulkan to avoid deadlock Change 3348303 on 2017/03/15 by Rolando.Caloca DR - Fix for debugging SCW with material SRT Change 3348357 on 2017/03/15 by Marcus.Wassmer Fix stencildither and a stencilref bug that was probably breaking decals sometimes. Change 3348549 on 2017/03/15 by Marcus.Wassmer Hopefully fix static analysis for potential nullptr access. Change 3348614 on 2017/03/15 by Marcus.Wassmer Duplicate some switch changes to fix crash on launch. Change 3349369 on 2017/03/16 by Gil.Gribb Fixed botched merge Change 3349947 on 2017/03/16 by Rolando.Caloca DR - Fix for mismatched primitive type Change 3349956 on 2017/03/16 by Benjamin.Hyder initial updates to TM-DistanceFields map Change 3350151 on 2017/03/16 by Rolando.Caloca DR - Fix UT compile issue Change 3350155 on 2017/03/16 by Rolando.Caloca DR - Catch mismatched primitive type on PSOs on D3D11 Change 3350192 on 2017/03/16 by Daniel.Wright Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor Change 3350736 on 2017/03/16 by Daniel.Wright Fixed formatting from merge Change 3350881 on 2017/03/16 by Rolando.Caloca DR - Fix texture arrays as UAVs on Metal Change 3350927 on 2017/03/16 by Rolando.Caloca DR - Fix warning Change 3350935 on 2017/03/16 by Daniel.Wright Fix for materials with non-Surface domains being skipped in mesh passes Change 3351583 on 2017/03/17 by Marcus.Wassmer Fix clang platforms Change 3351917 on 2017/03/17 by Marcus.Wassmer Fix linux compile Change 3351973 on 2017/03/17 by Marcus.Wassmer Fix mismatched rendertargetformat Change 3352038 on 2017/03/17 by Daniel.Wright Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing Change 3352110 on 2017/03/17 by Marcus.Wassmer Fix missing RT PSO apply Change 3352695 on 2017/03/17 by Arne.Schober DR - Remove PSO Rendertarget check in DX12 Resolve with Shader. #RB Rolando.Caloca Change 3352960 on 2017/03/17 by Arne.Schober DR - Fix some things that slipped trough the PSO merge #RB none Change 3353150 on 2017/03/18 by Rolando.Caloca DR - compile fix Change 3353205 on 2017/03/18 by Arne.Schober DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode #RB none Change 3353207 on 2017/03/18 by Arne.Schober DR - Fix Confusion #RB none Change 3355183 on 2017/03/20 by Nick.Bullard Fixed up Content orginzation for Decals automation tests in EngineTest Change 3355627 on 2017/03/20 by Arne.Schober DR - [UE-43094] - removed ensure in comporiton graph as control of the clear color cannot be gurantueed. Change 3356342 on 2017/03/21 by Marcus.Wassmer Fix clang errors Change 3356591 on 2017/03/21 by Arne.Schober DR - Fix ensure message #RB none Change 3356873 on 2017/03/21 by Arne.Schober DR - Fix comparission of undefined values in RendertargetApply Check Change 3357261 on 2017/03/21 by Marcus.Wassmer Fix LinuxEditor compile Change 3357294 on 2017/03/21 by Marcus.Wassmer Add missing SSE functions Change 3357351 on 2017/03/21 by Frank.Fella Fix win32 and linux compiler errors Change 3357370 on 2017/03/21 by Arne.Schober DR - disable ensure in test builds #RB Marcus.Wassmer [CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
FGlobalShader::SetParameters<FViewUniformShaderParameters>(RHICmdList, ShaderRHI, View.ViewUniformBuffer);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
DeferredParameters.Set(RHICmdList, ShaderRHI, View);
SetTextureParameter(RHICmdList, ShaderRHI, VisualizeDistanceFieldTexture, VisualizeDistanceFieldSampler, TStaticSamplerState<SF_Bilinear>::GetRHI(), VisualizeDistanceField->GetRenderTargetItem().ShaderResourceTexture);
}
// FShader interface.
virtual bool Serialize(FArchive& Ar) override
{
bool bShaderHasOutdatedParameters = FGlobalShader::Serialize(Ar);
Ar << DeferredParameters;
Ar << VisualizeDistanceFieldTexture;
Ar << VisualizeDistanceFieldSampler;
return bShaderHasOutdatedParameters;
}
private:
FDeferredPixelShaderParameters DeferredParameters;
FShaderResourceParameter VisualizeDistanceFieldTexture;
FShaderResourceParameter VisualizeDistanceFieldSampler;
};
IMPLEMENT_SHADER_TYPE(,FVisualizeDistanceFieldUpsamplePS,TEXT("DistanceFieldVisualization"),TEXT("VisualizeDistanceFieldUpsamplePS"),SF_Pixel);
void FDeferredShadingSceneRenderer::RenderMeshDistanceFieldVisualization(FRHICommandListImmediate& RHICmdList, const FDistanceFieldAOParameters& Parameters)
{
//@todo - support multiple views
const FViewInfo& View = Views[0];
extern int32 GDistanceFieldAO;
if (GDistanceFieldAO
&& FeatureLevel >= ERHIFeatureLevel::SM5
&& DoesPlatformSupportDistanceFieldAO(View.GetShaderPlatform())
&& Views.Num() == 1)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_RenderMeshDistanceFieldVis);
SCOPED_DRAW_EVENT(RHICmdList, VisualizeMeshDistanceFields);
if (GDistanceFieldVolumeTextureAtlas.VolumeTextureRHI && Scene->DistanceFieldSceneData.NumObjectsInBuffer > 0)
{
check(!Scene->DistanceFieldSceneData.HasPendingOperations());
QUICK_SCOPE_CYCLE_COUNTER(STAT_AOIssueGPUWork);
const bool bUseGlobalDistanceField = UseGlobalDistanceField(Parameters) && View.Family->EngineShowFlags.VisualizeGlobalDistanceField;
CullObjectsToView(RHICmdList, Scene, View, Parameters, GAOCulledObjectBuffers);
TRefCountPtr<IPooledRenderTarget> VisualizeResultRT;
{
const FIntPoint BufferSize = GetBufferSizeForAO();
FPooledRenderTargetDesc Desc(FPooledRenderTargetDesc::Create2DDesc(BufferSize, PF_FloatRGBA, FClearValueBinding::None, TexCreate_None, TexCreate_RenderTargetable | TexCreate_UAV, false));
GRenderTargetPool.FindFreeElement(RHICmdList, Desc, VisualizeResultRT, TEXT("VisualizeDistanceField"));
}
{
SetRenderTarget(RHICmdList, NULL, NULL);
for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
{
const FViewInfo& ViewInfo = Views[ViewIndex];
uint32 GroupSizeX = FMath::DivideAndRoundUp(ViewInfo.ViewRect.Size().X / GAODownsampleFactor, GDistanceFieldAOTileSizeX);
uint32 GroupSizeY = FMath::DivideAndRoundUp(ViewInfo.ViewRect.Size().Y / GAODownsampleFactor, GDistanceFieldAOTileSizeY);
SCOPED_DRAW_EVENT(RHICmdList, VisualizeMeshDistanceFieldCS);
FSceneRenderTargetItem& VisualizeResultRTI = VisualizeResultRT->GetRenderTargetItem();
if (bUseGlobalDistanceField)
{
check(View.GlobalDistanceFieldInfo.Clipmaps.Num() > 0);
TShaderMapRef<TVisualizeMeshDistanceFieldCS<true> > ComputeShader(ViewInfo.ShaderMap);
RHICmdList.SetComputeShader(ComputeShader->GetComputeShader());
ComputeShader->SetParameters(RHICmdList, ViewInfo, VisualizeResultRTI, FVector2D(GroupSizeX, GroupSizeY), Parameters, View.GlobalDistanceFieldInfo);
DispatchComputeShader(RHICmdList, *ComputeShader, GroupSizeX, GroupSizeY, 1);
ComputeShader->UnsetParameters(RHICmdList, VisualizeResultRTI);
}
else
{
TShaderMapRef<TVisualizeMeshDistanceFieldCS<false> > ComputeShader(ViewInfo.ShaderMap);
RHICmdList.SetComputeShader(ComputeShader->GetComputeShader());
ComputeShader->SetParameters(RHICmdList, ViewInfo, VisualizeResultRTI, FVector2D(GroupSizeX, GroupSizeY), Parameters, View.GlobalDistanceFieldInfo);
DispatchComputeShader(RHICmdList, *ComputeShader, GroupSizeX, GroupSizeY, 1);
ComputeShader->UnsetParameters(RHICmdList, VisualizeResultRTI);
}
}
}
{
FSceneRenderTargets::Get(RHICmdList).BeginRenderingSceneColor(RHICmdList, ESimpleRenderTargetMode::EExistingColorAndDepth, FExclusiveDepthStencil::DepthRead_StencilRead);
for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
{
const FViewInfo& ViewInfo = Views[ViewIndex];
SCOPED_DRAW_EVENT(RHICmdList, UpsampleAO);
RHICmdList.SetViewport(ViewInfo.ViewRect.Min.X, ViewInfo.ViewRect.Min.Y, 0.0f, ViewInfo.ViewRect.Max.X, ViewInfo.ViewRect.Max.Y, 1.0f);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3244756 on 2017/01/03 by Marcus.Wassmer Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering) Change 3248667 on 2017/01/05 by Olaf.Piesche Resaving default asset because of engine verison issue; maybe unnecessary, but resaving niagara engine content to be sure #jira UE-40160 Change 3249324 on 2017/01/06 by Marcus.Wassmer Resave with an actual version to stop cook warning Change 3249611 on 2017/01/06 by Marcus.Wassmer Just remove warning-causing niagara data for now. Change 3308052 on 2017/02/16 by Rolando.Caloca DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute Change 3308109 on 2017/02/16 by Rolando.Caloca DR - Upgrade glslang to 1.0.39.1 Change 3308111 on 2017/02/16 by Rolando.Caloca DR - Update Vulkan distribution to 1.0.39.1 Change 3308153 on 2017/02/16 by Rolando.Caloca DR - Updated glslang libs Change 3308842 on 2017/02/17 by Rolando.Caloca DR - Fixed copy/paste Change 3310007 on 2017/02/17 by Chris.Bunner Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971. #jira UE-37792 Change 3310154 on 2017/02/17 by Chris.Bunner Assert when attempting to add a custom material attribute already in the base attributes list. Change 3310155 on 2017/02/17 by Chris.Bunner PR #3231: Validate material index before accessing (Contributed by projectgheist) #jira UE-41774, UE-41788 Change 3310162 on 2017/02/17 by Chris.Bunner PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist) #jira UE-41823, UE-41950 Change 3310176 on 2017/02/17 by Chris.Bunner Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini). Update to AGS 5.0.5. Partial code tidy up. Change 3310187 on 2017/02/17 by Chris.Bunner Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression. #jira UE-41594 Change 3310215 on 2017/02/17 by Chris.Bunner Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available). More descriptive error for missing Cubemap UV input on TextureSample material node . #jira UE-33098 Change 3310838 on 2017/02/18 by Joe.Graf Moved some private functions to public for a licensee #CodeReview: matt.kuhlenschmidt #rb: n/a Change 3311876 on 2017/02/20 by Rolando.Caloca DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB #jira UE-42014 Change 3314139 on 2017/02/21 by Rolando.Caloca DR - Minor cleanup pass - Remove FVulkanPendingState - Renamed some classes for clarity - Hoist pending UAVs for flush out to pending compute state Change 3314642 on 2017/02/21 by Rolando.Caloca DR - Some more renaming Change 3315431 on 2017/02/21 by Ben.Salem Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time. #tests Ran showdown demo several times Change 3316710 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Fix refract intrinsic Change 3316718 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Built libs to pick up change from 3316710 - refract fix Change 3316820 on 2017/02/22 by Benjamin.Hyder updating Tm-TrigNodes map Change 3317192 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317528 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317639 on 2017/02/22 by Benjamin.Hyder Updating Decal on Complex Mesh example in QA-Decals Change 3317764 on 2017/02/22 by Benjamin.Hyder Final updates to QA-Decals Change 3318319 on 2017/02/22 by Rolando.Caloca DR - minor reorg/rename Change 3318379 on 2017/02/22 by Rolando.Caloca DR - more cleanup Change 3321181 on 2017/02/24 by Rolando.Caloca DR - Fix GL bug Change 3321247 on 2017/02/24 by Rolando.Caloca DR - Fix misc bugs Change 3321898 on 2017/02/24 by Chris.Bunner Only issue clear TLV dispatch if required. #jira UERNDR-193 Change 3321904 on 2017/02/24 by Chris.Bunner Added comment for potential future optimization. Change 3322013 on 2017/02/24 by Uriel.Doyon Fixed separate translucency being affected by Gaussian DOF #jira UE-40489 Change 3322517 on 2017/02/24 by Uriel.Doyon Fixed issue with InvestigateTexture command removing budget limit. Fixed StreamingBounds show flag not working. It nows shows the streaming bound for the currently selected textures. #jira UE-40485 Change 3323470 on 2017/02/27 by Chad.Garyet Removing DDC job from dev-rendering Change 3323479 on 2017/02/27 by Chad.Garyet Removing RDU agent type Change 3323519 on 2017/02/27 by Chad.Garyet removing NCL/LHR/SEA agent types to clean up space Change 3323639 on 2017/02/27 by Benjamin.Hyder More updates to QA-Decals Change 3324207 on 2017/02/27 by Uriel.Doyon Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef Change 3324396 on 2017/02/27 by Uriel.Doyon Fixed an issue with the Streaming Bounds show flag interferring with the static level data initialization #jira UE-40485 Change 3325227 on 2017/02/28 by Chris.Bunner Fix-up AMD AGS libs. Change 3325566 on 2017/02/28 by Uriel.Doyon Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num Change 3326009 on 2017/02/28 by Uriel.Doyon Better fix for 3325566, as the previous fix would ignore the material instance overrides. Change 3327058 on 2017/03/01 by Benjamin.Hyder Preparing TM_Shadermodels map for automation Change 3328222 on 2017/03/01 by Chris.Bunner Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals. #jira UE-42449, UE-42446 Change 3329848 on 2017/03/02 by Uriel.Doyon Added some extra logs to help track UE-42168 Change 3329977 on 2017/03/02 by Rolando.Caloca DR - Fix bad clear value Change 3330008 on 2017/03/02 by Benjamin.Hyder More preparations for QA-Decals automation Change 3330754 on 2017/03/02 by Daniel.Wright Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything Change 3331451 on 2017/03/03 by Marc.Olano Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal Change 3331839 on 2017/03/03 by Rolando.Caloca DR - hlslcc - add missing file to project Change 3332247 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel PR #3305 #jira UE-42393 Change 3332259 on 2017/03/03 by Rolando.Caloca DR - Fix bad index into pixel formats PR #3237 #jira UE-41855 Change 3332305 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers PR #3271 #jira UE-32618 Change 3332313 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel (properly) PR #3305 #jira UE-42393 Change 3332317 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers (properly) PR #3271 #jira UE-32618 Change 3332368 on 2017/03/03 by Rolando.Caloca DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan Change 3333690 on 2017/03/06 by Daniel.Wright [Copy] Changing movable skylight properties no longer affects static draw lists Change 3333693 on 2017/03/06 by Daniel.Wright [Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations Change 3333705 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating piont with a project setting. * 8 bit uses half memory but introduces error for thin surfaces or large meshes. Change 3333721 on 2017/03/06 by David.Hill DecalProxy: Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component). Change 3333772 on 2017/03/06 by Daniel.Wright [Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur. Change 3333790 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance field generation uses Embree, for a 2.5x speedup * Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging Change 3333822 on 2017/03/06 by Daniel.Wright [Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp * Moved FMeshUtilities to its own header so the 8k line MeshUtilites.cpp file can be further split up Change 3333827 on 2017/03/06 by Daniel.Wright [Copy] Range compress 8bit distance fields - gets one extra bit of precision on average Change 3333828 on 2017/03/06 by Daniel.Wright [Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low Change 3333831 on 2017/03/06 by Daniel.Wright Non-editor compile fix Change 3333836 on 2017/03/06 by Daniel.Wright [Copy] Workaround for gobal distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb. Change 3333843 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionExponent to skylight component * Useful for brightening up indoors without losing contact shadows as MinOcclusion does Change 3333845 on 2017/03/06 by Daniel.Wright [Copy] Capsule shadow BP functions Change 3333850 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionCombineMode to skylight component Change 3333854 on 2017/03/06 by Daniel.Wright [Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu Change 3333857 on 2017/03/06 by Daniel.Wright [Copy] Clear light attenuation for local lights with a quad covering their screen extents * Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms. * Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4 Change 3333860 on 2017/03/06 by Daniel.Wright [Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory Change 3333861 on 2017/03/06 by Daniel.Wright [Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas Change 3333869 on 2017/03/06 by Daniel.Wright [Copy] Volumetric Fog using a volume texture mapped to the camera frustum * Volumetric fog can be enabled on an Exponential Height Fog component with additional controls * Lights have a VolumetricScatteringIntensity * New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale * Lighting features supported: * Directional light with CSM and a light function * Point / spot lights without shadows / light functions / IES profiles * Skylight with occlusion from distance fields * Analytical height fog covers the view range past where the volumetric fog ends * Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability * Translucency integrates properly into volumetric fog * Height fog StartDistance is not supported by volumetric fog and should be set to 0. Change 3333894 on 2017/03/06 by Daniel.Wright [Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering Change 3333902 on 2017/03/06 by Daniel.Wright [Copy] Better handling of volumetric fog enabled with distance of 0 Change 3333903 on 2017/03/06 by Daniel.Wright [Copy] Fixed volumetric fog trying to render light functions for a point light Change 3333908 on 2017/03/06 by Daniel.Wright [Copy] Volumetric materials * Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities. * Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius * Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice. * Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely Change 3334134 on 2017/03/06 by Daniel.Wright [Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree. Change 3334420 on 2017/03/06 by Daniel.Wright Fixed RTDF shadows Change 3335467 on 2017/03/07 by Benjamin.Hyder Initial submission of QA-Decals map to EngineTest Change 3335556 on 2017/03/07 by Daniel.Wright Changed mesh distance field default format back to R16f Change 3338020 on 2017/03/08 by Daniel.Wright Disable volumetric fog in vertex shaders for feature levels which don't support it Change 3339394 on 2017/03/09 by Chris.Bunner Correctly handle material texture translation error edge case. #jira UE-42579, UE-42670 Change 3339992 on 2017/03/09 by Daniel.Wright Only compile volumetric fog shaders on supporting platforms Change 3341858 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) #RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite Change 3342004 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) Fix unity build #RB Marcus.Wassmer Change 3343307 on 2017/03/13 by Marcus.Wassmer Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc) Change 3343732 on 2017/03/13 by Rolando.Caloca DR - Vulkan compute pipeline & refactor Change 3344846 on 2017/03/14 by Rolando.Caloca DR - Android compile fixes Change 3344883 on 2017/03/14 by Rolando.Caloca DR - Add missing stencil load/store to PSO initializer Change 3344985 on 2017/03/14 by Rolando.Caloca DR - Made load/store actions uint8 Change 3345141 on 2017/03/14 by Rolando.Caloca DR - vk - Rework render pass hash Change 3345304 on 2017/03/14 by Benjamin.Hyder Updating TM-Distancefields map to include TemplateFloor mesh Change 3345387 on 2017/03/14 by Rolando.Caloca DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating Change 3345388 on 2017/03/14 by Rolando.Caloca DR - Do not stall when creating shaders on Vulkan Change 3345722 on 2017/03/14 by Chris.Bunner PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC) #jira UE-42752 Change 3345723 on 2017/03/14 by Chris.Bunner Reduce log verbosity causing spamming during landscape editing. #jira UE-42714 Change 3345725 on 2017/03/14 by Chris.Bunner [Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes. Change 3345726 on 2017/03/14 by Chris.Bunner Typo fixes. Change 3345732 on 2017/03/14 by Rolando.Caloca DR - Decouple vertex declaration off BSS Change 3345746 on 2017/03/14 by Chris.Bunner Added sign() intrinsic material graph node and delisted material function workaround. Change 3346042 on 2017/03/14 by Chris.Bunner Implement missing size query interface for FRenderTargetResources. #jira UE-41672 Change 3346387 on 2017/03/14 by Daniel.Wright [Copy] Added VolumetricScatteringIntensity to particle lights Change 3346389 on 2017/03/14 by Daniel.Wright [Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs Disable volumetric fog when the fog show flag is disabled Change 3346392 on 2017/03/14 by Daniel.Wright [Copy] Fixed skylight being much too bright on volumetric fog Change 3346406 on 2017/03/14 by Daniel.Wright [Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution. * Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite Change 3346412 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb Change 3346414 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb Change 3346415 on 2017/03/14 by Daniel.Wright [Copy] Missing file from cl 3338451 Change 3346421 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled * Volumetric fog converts NaNs to black now so they don't spread Change 3346422 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaN in volumetric fog with low density values Change 3346423 on 2017/03/14 by Daniel.Wright [Copy] Changed default VolumetricFogScatteringDistribution to .2 Change 3346430 on 2017/03/14 by Daniel.Wright [Copy] New translucent material option to compute fog per pixel instead of the default per vertex Change 3346432 on 2017/03/14 by Daniel.Wright [Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass Fixed lifetimes of temporary Volumetric Fog render targets Change 3346526 on 2017/03/14 by Daniel.Wright [Copy] Volumetric Fog supports point and spot light shadows * These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map) * Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0' * Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing Change 3347053 on 2017/03/15 by Rolando.Caloca DR - android compile fix Change 3347384 on 2017/03/15 by Rolando.Caloca DR - Fix merge issue Change 3347643 on 2017/03/15 by Marcus.Wassmer Fix some bugs with the 'disable stationary skylight ffor the project' feature. Fixes lighting in Persona on Paragon. Change 3347979 on 2017/03/15 by Rolando.Caloca DR - Allow to automatically apply cached rendertargets to PSO initializer Change 3348024 on 2017/03/15 by Rolando.Caloca DR - Remove NullPS on Vulkan to avoid deadlock Change 3348303 on 2017/03/15 by Rolando.Caloca DR - Fix for debugging SCW with material SRT Change 3348357 on 2017/03/15 by Marcus.Wassmer Fix stencildither and a stencilref bug that was probably breaking decals sometimes. Change 3348549 on 2017/03/15 by Marcus.Wassmer Hopefully fix static analysis for potential nullptr access. Change 3348614 on 2017/03/15 by Marcus.Wassmer Duplicate some switch changes to fix crash on launch. Change 3349369 on 2017/03/16 by Gil.Gribb Fixed botched merge Change 3349947 on 2017/03/16 by Rolando.Caloca DR - Fix for mismatched primitive type Change 3349956 on 2017/03/16 by Benjamin.Hyder initial updates to TM-DistanceFields map Change 3350151 on 2017/03/16 by Rolando.Caloca DR - Fix UT compile issue Change 3350155 on 2017/03/16 by Rolando.Caloca DR - Catch mismatched primitive type on PSOs on D3D11 Change 3350192 on 2017/03/16 by Daniel.Wright Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor Change 3350736 on 2017/03/16 by Daniel.Wright Fixed formatting from merge Change 3350881 on 2017/03/16 by Rolando.Caloca DR - Fix texture arrays as UAVs on Metal Change 3350927 on 2017/03/16 by Rolando.Caloca DR - Fix warning Change 3350935 on 2017/03/16 by Daniel.Wright Fix for materials with non-Surface domains being skipped in mesh passes Change 3351583 on 2017/03/17 by Marcus.Wassmer Fix clang platforms Change 3351917 on 2017/03/17 by Marcus.Wassmer Fix linux compile Change 3351973 on 2017/03/17 by Marcus.Wassmer Fix mismatched rendertargetformat Change 3352038 on 2017/03/17 by Daniel.Wright Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing Change 3352110 on 2017/03/17 by Marcus.Wassmer Fix missing RT PSO apply Change 3352695 on 2017/03/17 by Arne.Schober DR - Remove PSO Rendertarget check in DX12 Resolve with Shader. #RB Rolando.Caloca Change 3352960 on 2017/03/17 by Arne.Schober DR - Fix some things that slipped trough the PSO merge #RB none Change 3353150 on 2017/03/18 by Rolando.Caloca DR - compile fix Change 3353205 on 2017/03/18 by Arne.Schober DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode #RB none Change 3353207 on 2017/03/18 by Arne.Schober DR - Fix Confusion #RB none Change 3355183 on 2017/03/20 by Nick.Bullard Fixed up Content orginzation for Decals automation tests in EngineTest Change 3355627 on 2017/03/20 by Arne.Schober DR - [UE-43094] - removed ensure in comporiton graph as control of the clear color cannot be gurantueed. Change 3356342 on 2017/03/21 by Marcus.Wassmer Fix clang errors Change 3356591 on 2017/03/21 by Arne.Schober DR - Fix ensure message #RB none Change 3356873 on 2017/03/21 by Arne.Schober DR - Fix comparission of undefined values in RendertargetApply Check Change 3357261 on 2017/03/21 by Marcus.Wassmer Fix LinuxEditor compile Change 3357294 on 2017/03/21 by Marcus.Wassmer Add missing SSE functions Change 3357351 on 2017/03/21 by Frank.Fella Fix win32 and linux compiler errors Change 3357370 on 2017/03/21 by Arne.Schober DR - disable ensure in test builds #RB Marcus.Wassmer [CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
FGraphicsPipelineStateInitializer GraphicsPSOInit;
RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
GraphicsPSOInit.RasterizerState = TStaticRasterizerState<FM_Solid, CM_None>::GetRHI();
GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<false, CF_Always>::GetRHI();
GraphicsPSOInit.BlendState = TStaticBlendState<>::GetRHI();
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
TShaderMapRef<FPostProcessVS> VertexShader( ViewInfo.ShaderMap );
TShaderMapRef<FVisualizeDistanceFieldUpsamplePS> PixelShader( ViewInfo.ShaderMap );
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3357411) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3244756 on 2017/01/03 by Marcus.Wassmer Copying //Tasks/UE4/Dev-Niagara@3244743 to Dev-Rendering (//UE4/Dev-Rendering) Change 3248667 on 2017/01/05 by Olaf.Piesche Resaving default asset because of engine verison issue; maybe unnecessary, but resaving niagara engine content to be sure #jira UE-40160 Change 3249324 on 2017/01/06 by Marcus.Wassmer Resave with an actual version to stop cook warning Change 3249611 on 2017/01/06 by Marcus.Wassmer Just remove warning-causing niagara data for now. Change 3308052 on 2017/02/16 by Rolando.Caloca DR - Check for Vulkan SDK, and only use it if it's newer or the same as the headers we distribute Change 3308109 on 2017/02/16 by Rolando.Caloca DR - Upgrade glslang to 1.0.39.1 Change 3308111 on 2017/02/16 by Rolando.Caloca DR - Update Vulkan distribution to 1.0.39.1 Change 3308153 on 2017/02/16 by Rolando.Caloca DR - Updated glslang libs Change 3308842 on 2017/02/17 by Rolando.Caloca DR - Fixed copy/paste Change 3310007 on 2017/02/17 by Chris.Bunner Back out CL 3221219 - causing MIC generation issues and superseded by CL 3273971. #jira UE-37792 Change 3310154 on 2017/02/17 by Chris.Bunner Assert when attempting to add a custom material attribute already in the base attributes list. Change 3310155 on 2017/02/17 by Chris.Bunner PR #3231: Validate material index before accessing (Contributed by projectgheist) #jira UE-41774, UE-41788 Change 3310162 on 2017/02/17 by Chris.Bunner PR #3252: Added MobileMaterialInterface to UsedMaterials (Contributed by projectgheist) #jira UE-41823, UE-41950 Change 3310176 on 2017/02/17 by Chris.Bunner Merging CL 3233886: AMD HDR support (requires r.AMDSupportsHDRDisplayOutput=1 in ini). Update to AGS 5.0.5. Partial code tidy up. Change 3310187 on 2017/02/17 by Chris.Bunner Preserve constant expressions rather than always casting after translating a material attribute. Losing the notion of constant means we can't correctly detect used properties and falsely enable e.g. PDO. Happened because of the incorrect component masks in BreakMaterialNodes which then had to be downcast to the correct type which is done as an inline fragment rather than swizzle expression. #jira UE-41594 Change 3310215 on 2017/02/17 by Chris.Bunner Prevent SpeedTree node compiling for skeletal meshes (not supported as uses more UV sets than available). More descriptive error for missing Cubemap UV input on TextureSample material node . #jira UE-33098 Change 3310838 on 2017/02/18 by Joe.Graf Moved some private functions to public for a licensee #CodeReview: matt.kuhlenschmidt #rb: n/a Change 3311876 on 2017/02/20 by Rolando.Caloca DR - Expose skin cache cvar r.SkinCache.AccumulationBufferSizeInKB #jira UE-42014 Change 3314139 on 2017/02/21 by Rolando.Caloca DR - Minor cleanup pass - Remove FVulkanPendingState - Renamed some classes for clarity - Hoist pending UAVs for flush out to pending compute state Change 3314642 on 2017/02/21 by Rolando.Caloca DR - Some more renaming Change 3315431 on 2017/02/21 by Ben.Salem Properly set default values for test time out and tick. We now will default to ticking once per second, and tracking the macro stats of GPU/Render/Game thread time. #tests Ran showdown demo several times Change 3316710 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Fix refract intrinsic Change 3316718 on 2017/02/22 by Rolando.Caloca DR - hlslcc - Built libs to pick up change from 3316710 - refract fix Change 3316820 on 2017/02/22 by Benjamin.Hyder updating Tm-TrigNodes map Change 3317192 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317528 on 2017/02/22 by Benjamin.Hyder Updating QA-Decals map Change 3317639 on 2017/02/22 by Benjamin.Hyder Updating Decal on Complex Mesh example in QA-Decals Change 3317764 on 2017/02/22 by Benjamin.Hyder Final updates to QA-Decals Change 3318319 on 2017/02/22 by Rolando.Caloca DR - minor reorg/rename Change 3318379 on 2017/02/22 by Rolando.Caloca DR - more cleanup Change 3321181 on 2017/02/24 by Rolando.Caloca DR - Fix GL bug Change 3321247 on 2017/02/24 by Rolando.Caloca DR - Fix misc bugs Change 3321898 on 2017/02/24 by Chris.Bunner Only issue clear TLV dispatch if required. #jira UERNDR-193 Change 3321904 on 2017/02/24 by Chris.Bunner Added comment for potential future optimization. Change 3322013 on 2017/02/24 by Uriel.Doyon Fixed separate translucency being affected by Gaussian DOF #jira UE-40489 Change 3322517 on 2017/02/24 by Uriel.Doyon Fixed issue with InvestigateTexture command removing budget limit. Fixed StreamingBounds show flag not working. It nows shows the streaming bound for the currently selected textures. #jira UE-40485 Change 3323470 on 2017/02/27 by Chad.Garyet Removing DDC job from dev-rendering Change 3323479 on 2017/02/27 by Chad.Garyet Removing RDU agent type Change 3323519 on 2017/02/27 by Chad.Garyet removing NCL/LHR/SEA agent types to clean up space Change 3323639 on 2017/02/27 by Benjamin.Hyder More updates to QA-Decals Change 3324207 on 2017/02/27 by Uriel.Doyon Fixed typo ScaleTexturesByGlobalMyBias -> ScaleTexturesByGlobalMipBias Removed bad merge in FStreamingTextureLevelContext::GetBuildDataIndexRef Change 3324396 on 2017/02/27 by Uriel.Doyon Fixed an issue with the Streaming Bounds show flag interferring with the static level data initialization #jira UE-40485 Change 3325227 on 2017/02/28 by Chris.Bunner Fix-up AMD AGS libs. Change 3325566 on 2017/02/28 by Uriel.Doyon Fixed possible out-of-bound access in GetUsedTexture() when passing ERHIFeatureLevel::Num Change 3326009 on 2017/02/28 by Uriel.Doyon Better fix for 3325566, as the previous fix would ignore the material instance overrides. Change 3327058 on 2017/03/01 by Benjamin.Hyder Preparing TM_Shadermodels map for automation Change 3328222 on 2017/03/01 by Chris.Bunner Prevent decals from drawing in separate translucency pass. Whilst user control and material relevance were already removed, if the flag was checked before being disabled (by swapping to decal domain) this was still being read in the render loop, now explicitly ignores decals. #jira UE-42449, UE-42446 Change 3329848 on 2017/03/02 by Uriel.Doyon Added some extra logs to help track UE-42168 Change 3329977 on 2017/03/02 by Rolando.Caloca DR - Fix bad clear value Change 3330008 on 2017/03/02 by Benjamin.Hyder More preparations for QA-Decals automation Change 3330754 on 2017/03/02 by Daniel.Wright Prominent comment explaining reflection env async compute usage and why it's not overlapped with anything Change 3331451 on 2017/03/03 by Marc.Olano Manually unroll simplex noise loop to avoid PSO bug on AMD/Metal Change 3331839 on 2017/03/03 by Rolando.Caloca DR - hlslcc - add missing file to project Change 3332247 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel PR #3305 #jira UE-42393 Change 3332259 on 2017/03/03 by Rolando.Caloca DR - Fix bad index into pixel formats PR #3237 #jira UE-41855 Change 3332305 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers PR #3271 #jira UE-32618 Change 3332313 on 2017/03/03 by Rolando.Caloca DR - Fix for integrated intel (properly) PR #3305 #jira UE-42393 Change 3332317 on 2017/03/03 by Rolando.Caloca DR - OpenGL SRV for index buffers (properly) PR #3271 #jira UE-32618 Change 3332368 on 2017/03/03 by Rolando.Caloca DR - Minor fixes so -sm4 and -sm5 can be used on windows with OpenGL/Vulkan Change 3333690 on 2017/03/06 by Daniel.Wright [Copy] Changing movable skylight properties no longer affects static draw lists Change 3333693 on 2017/03/06 by Daniel.Wright [Copy] Added 'r.AOListMeshDistanceFields' which dumps out mesh distance fields sorted by memory size, useful for directing content optimizations Change 3333705 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance fields are now 8 bit fixed point by default, but can be changed back to 16 bit floating piont with a project setting. * 8 bit uses half memory but introduces error for thin surfaces or large meshes. Change 3333721 on 2017/03/06 by David.Hill DecalProxy: Copy float FadeScreenSize to FDeferredDecalProxy for use in the render thread. This avoids pointer chasing to the UDecalComponent (game thread component). Change 3333772 on 2017/03/06 by Daniel.Wright [Copy] Scene motion blur data is only updated for the main renderer frames. Fixes scene captures and planar reflections breaking object motion blur. Change 3333790 on 2017/03/06 by Daniel.Wright [Copy] Mesh distance field generation uses Embree, for a 2.5x speedup * Can switch back to old kDOP generation with 'r.DistanceFieldBuild.UseEmbree 0' for debugging Change 3333822 on 2017/03/06 by Daniel.Wright [Copy] Moved mesh distance field code into MeshDistanceFieldUtilities.cpp * Moved FMeshUtilities to its own header so the 8k line MeshUtilites.cpp file can be further split up Change 3333827 on 2017/03/06 by Daniel.Wright [Copy] Range compress 8bit distance fields - gets one extra bit of precision on average Change 3333828 on 2017/03/06 by Daniel.Wright [Copy] Raised High ShadowQuality to 2048 as 1024 for CSM is way too low Change 3333831 on 2017/03/06 by Daniel.Wright Non-editor compile fix Change 3333836 on 2017/03/06 by Daniel.Wright [Copy] Workaround for gobal distance field volume textures being bloated by 4x on PS4 due to the recommended tiling modes. They now use a 2d tiling mode which avoids the bloat, saving 96Mb. Change 3333843 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionExponent to skylight component * Useful for brightening up indoors without losing contact shadows as MinOcclusion does Change 3333845 on 2017/03/06 by Daniel.Wright [Copy] Capsule shadow BP functions Change 3333850 on 2017/03/06 by Daniel.Wright [Copy] Added OcclusionCombineMode to skylight component Change 3333854 on 2017/03/06 by Daniel.Wright [Copy] Gnm properly registers clears as GPU work so those events show up in profilegpu Change 3333857 on 2017/03/06 by Daniel.Wright [Copy] Clear light attenuation for local lights with a quad covering their screen extents * Clearing the entire light attenuation buffer costs .1ms on PS4. This optimization lowers the minimum cost of a shadow casting light from .15ms -> .03ms. * Shadowed lights in Fortnite with 25 lights 3.7ms -> 1.42ms on PS4 Change 3333860 on 2017/03/06 by Daniel.Wright [Copy] Flush deferred deletes when reallocating distance field atlas to reduce peak memory Change 3333861 on 2017/03/06 by Daniel.Wright [Copy] Disable all distance field features on Intel cards as HD 4000 hangs in the RHICreateTexture3D call to allocate the large atlas Change 3333869 on 2017/03/06 by Daniel.Wright [Copy] Volumetric Fog using a volume texture mapped to the camera frustum * Volumetric fog can be enabled on an Exponential Height Fog component with additional controls * Lights have a VolumetricScatteringIntensity * New cvars r.VolumetricFog, r.VolumetricFog.GridPixelSize, r.VolumetricFog.GridSizeZ, r.VolumetricFog.DepthDistributionScale * Lighting features supported: * Directional light with CSM and a light function * Point / spot lights without shadows / light functions / IES profiles * Skylight with occlusion from distance fields * Analytical height fog covers the view range past where the volumetric fog ends * Temporal reprojection is used on the volumetric fog scattering and extinction to achieve stability * Translucency integrates properly into volumetric fog * Height fog StartDistance is not supported by volumetric fog and should be set to 0. Change 3333894 on 2017/03/06 by Daniel.Wright [Copy] Initialize GDummyVolumetricFogGlobalDataUniformBuffer outside of parallel rendering Change 3333902 on 2017/03/06 by Daniel.Wright [Copy] Better handling of volumetric fog enabled with distance of 0 Change 3333903 on 2017/03/06 by Daniel.Wright [Copy] Fixed volumetric fog trying to render light functions for a point light Change 3333908 on 2017/03/06 by Daniel.Wright [Copy] Volumetric materials * Added new material domain Volume, which can output Scattering, Absorption and Emissive. All properties are in world space densities. * Particle systems using the Volume domain are voxelized based on their ParticlePosition and ParticleRadius * Volumetric fog integration is now energy conservative - scattering is integrated against transmission over the depth of each slice. * Added bOverrideLightColorsWithFogInscatteringColors to exponential height fog, which can be enabled to make Volumetric Fog match Height fog more closely Change 3334134 on 2017/03/06 by Daniel.Wright [Copy from Michael Trepka] Added Embree 2.14.0 and changed MeshUtilities to use it as this solves issues with Embree leaking TLS keys. UnrealLightmass is still using older Embree 2.7.0 until we can find time to properly test it with the new version. Also, invalidated distance field DDC to force it to rebuild with updated Embree. Change 3334420 on 2017/03/06 by Daniel.Wright Fixed RTDF shadows Change 3335467 on 2017/03/07 by Benjamin.Hyder Initial submission of QA-Decals map to EngineTest Change 3335556 on 2017/03/07 by Daniel.Wright Changed mesh distance field default format back to R16f Change 3338020 on 2017/03/08 by Daniel.Wright Disable volumetric fog in vertex shaders for feature levels which don't support it Change 3339394 on 2017/03/09 by Chris.Bunner Correctly handle material texture translation error edge case. #jira UE-42579, UE-42670 Change 3339992 on 2017/03/09 by Daniel.Wright Only compile volumetric fog shaders on supporting platforms Change 3341858 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) #RB Rolando.Caloca, Marcus.Wassmer, Daniel.Wright, Nick.Penwarden, Mark.Satterthwaite Change 3342004 on 2017/03/10 by Arne.Schober Copying //UE4/Dev-Rendering-PSO to Dev-Rendering (//UE4/Dev-Rendering) Fix unity build #RB Marcus.Wassmer Change 3343307 on 2017/03/13 by Marcus.Wassmer Update showflags when we are guaranteed it will happen in all possible ways to spawn the scenecapture. (drag into editor, PIE, -game, etc) Change 3343732 on 2017/03/13 by Rolando.Caloca DR - Vulkan compute pipeline & refactor Change 3344846 on 2017/03/14 by Rolando.Caloca DR - Android compile fixes Change 3344883 on 2017/03/14 by Rolando.Caloca DR - Add missing stencil load/store to PSO initializer Change 3344985 on 2017/03/14 by Rolando.Caloca DR - Made load/store actions uint8 Change 3345141 on 2017/03/14 by Rolando.Caloca DR - vk - Rework render pass hash Change 3345304 on 2017/03/14 by Benjamin.Hyder Updating TM-Distancefields map to include TemplateFloor mesh Change 3345387 on 2017/03/14 by Rolando.Caloca DR - Add _RenderThread calls for Create*Shader so RHIs can choose not to stall when creating Change 3345388 on 2017/03/14 by Rolando.Caloca DR - Do not stall when creating shaders on Vulkan Change 3345722 on 2017/03/14 by Chris.Bunner PR #3357: MinimalAPI add to many material expressions (Contributed by DeanoC) #jira UE-42752 Change 3345723 on 2017/03/14 by Chris.Bunner Reduce log verbosity causing spamming during landscape editing. #jira UE-42714 Change 3345725 on 2017/03/14 by Chris.Bunner [Duplicate 3341860] Fixed material translation error with multiple connections from custom interpolator nodes. Change 3345726 on 2017/03/14 by Chris.Bunner Typo fixes. Change 3345732 on 2017/03/14 by Rolando.Caloca DR - Decouple vertex declaration off BSS Change 3345746 on 2017/03/14 by Chris.Bunner Added sign() intrinsic material graph node and delisted material function workaround. Change 3346042 on 2017/03/14 by Chris.Bunner Implement missing size query interface for FRenderTargetResources. #jira UE-41672 Change 3346387 on 2017/03/14 by Daniel.Wright [Copy] Added VolumetricScatteringIntensity to particle lights Change 3346389 on 2017/03/14 by Daniel.Wright [Copy] Clamp Volumetric material attributes to fp16 range to avoid INFs Disable volumetric fog when the fog show flag is disabled Change 3346392 on 2017/03/14 by Daniel.Wright [Copy] Fixed skylight being much too bright on volumetric fog Change 3346406 on 2017/03/14 by Daniel.Wright [Copy] CSM resolution is now controlled by r.Shadow.MaxCSMResolution. * Changed HighPC to use 1024 MaxShadowResolution (max for all non-CSM shadows), saves 60Mb in Fortnite Change 3346412 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for translucency lighting 3d textures, saves 13Mb Change 3346414 on 2017/03/14 by Daniel.Wright [Copy] TexCreate_ReduceMemoryWithTilingMode for volumetric fog 3d textures, saves 13Mb Change 3346415 on 2017/03/14 by Daniel.Wright [Copy] Missing file from cl 3338451 Change 3346421 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaNs in volumetric fog due to rendering when height fog is disabled * Volumetric fog converts NaNs to black now so they don't spread Change 3346422 on 2017/03/14 by Daniel.Wright [Copy] Fixed NaN in volumetric fog with low density values Change 3346423 on 2017/03/14 by Daniel.Wright [Copy] Changed default VolumetricFogScatteringDistribution to .2 Change 3346430 on 2017/03/14 by Daniel.Wright [Copy] New translucent material option to compute fog per pixel instead of the default per vertex Change 3346432 on 2017/03/14 by Daniel.Wright [Copy] Moved Volumetric Fog parameters to view uniform buffer for translucency pass Fixed lifetimes of temporary Volumetric Fog render targets Change 3346526 on 2017/03/14 by Daniel.Wright [Copy] Volumetric Fog supports point and spot light shadows * These lights are injected separately so that per-light resources can be bound (shadow depth map, static shadow depth map) * Forward lighting of local lights can be forced with 'r.VolumetricFog.InjectShadowedLightsSeparately 0' * Shadowed lights come at a cost: 2.9ms for volumetric fog on 970 -> 4.2ms with shadowing Change 3347053 on 2017/03/15 by Rolando.Caloca DR - android compile fix Change 3347384 on 2017/03/15 by Rolando.Caloca DR - Fix merge issue Change 3347643 on 2017/03/15 by Marcus.Wassmer Fix some bugs with the 'disable stationary skylight ffor the project' feature. Fixes lighting in Persona on Paragon. Change 3347979 on 2017/03/15 by Rolando.Caloca DR - Allow to automatically apply cached rendertargets to PSO initializer Change 3348024 on 2017/03/15 by Rolando.Caloca DR - Remove NullPS on Vulkan to avoid deadlock Change 3348303 on 2017/03/15 by Rolando.Caloca DR - Fix for debugging SCW with material SRT Change 3348357 on 2017/03/15 by Marcus.Wassmer Fix stencildither and a stencilref bug that was probably breaking decals sometimes. Change 3348549 on 2017/03/15 by Marcus.Wassmer Hopefully fix static analysis for potential nullptr access. Change 3348614 on 2017/03/15 by Marcus.Wassmer Duplicate some switch changes to fix crash on launch. Change 3349369 on 2017/03/16 by Gil.Gribb Fixed botched merge Change 3349947 on 2017/03/16 by Rolando.Caloca DR - Fix for mismatched primitive type Change 3349956 on 2017/03/16 by Benjamin.Hyder initial updates to TM-DistanceFields map Change 3350151 on 2017/03/16 by Rolando.Caloca DR - Fix UT compile issue Change 3350155 on 2017/03/16 by Rolando.Caloca DR - Catch mismatched primitive type on PSOs on D3D11 Change 3350192 on 2017/03/16 by Daniel.Wright Fix for point light shadow depths rendering with wrong cull mode due to PSO refactor Change 3350736 on 2017/03/16 by Daniel.Wright Fixed formatting from merge Change 3350881 on 2017/03/16 by Rolando.Caloca DR - Fix texture arrays as UAVs on Metal Change 3350927 on 2017/03/16 by Rolando.Caloca DR - Fix warning Change 3350935 on 2017/03/16 by Daniel.Wright Fix for materials with non-Surface domains being skipped in mesh passes Change 3351583 on 2017/03/17 by Marcus.Wassmer Fix clang platforms Change 3351917 on 2017/03/17 by Marcus.Wassmer Fix linux compile Change 3351973 on 2017/03/17 by Marcus.Wassmer Fix mismatched rendertargetformat Change 3352038 on 2017/03/17 by Daniel.Wright Enabled GetAndOrCreateGraphicsPipelineState ensures in Development for testing Change 3352110 on 2017/03/17 by Marcus.Wassmer Fix missing RT PSO apply Change 3352695 on 2017/03/17 by Arne.Schober DR - Remove PSO Rendertarget check in DX12 Resolve with Shader. #RB Rolando.Caloca Change 3352960 on 2017/03/17 by Arne.Schober DR - Fix some things that slipped trough the PSO merge #RB none Change 3353150 on 2017/03/18 by Rolando.Caloca DR - compile fix Change 3353205 on 2017/03/18 by Arne.Schober DR - Fix Incremental Compile and PS4 runtime error where CMASK is not allowed for ThickTile Mode #RB none Change 3353207 on 2017/03/18 by Arne.Schober DR - Fix Confusion #RB none Change 3355183 on 2017/03/20 by Nick.Bullard Fixed up Content orginzation for Decals automation tests in EngineTest Change 3355627 on 2017/03/20 by Arne.Schober DR - [UE-43094] - removed ensure in comporiton graph as control of the clear color cannot be gurantueed. Change 3356342 on 2017/03/21 by Marcus.Wassmer Fix clang errors Change 3356591 on 2017/03/21 by Arne.Schober DR - Fix ensure message #RB none Change 3356873 on 2017/03/21 by Arne.Schober DR - Fix comparission of undefined values in RendertargetApply Check Change 3357261 on 2017/03/21 by Marcus.Wassmer Fix LinuxEditor compile Change 3357294 on 2017/03/21 by Marcus.Wassmer Add missing SSE functions Change 3357351 on 2017/03/21 by Frank.Fella Fix win32 and linux compiler errors Change 3357370 on 2017/03/21 by Arne.Schober DR - disable ensure in test builds #RB Marcus.Wassmer [CL 3357449 by Marcus Wassmer in Main branch]
2017-03-21 17:46:52 -04:00
GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
GraphicsPSOInit.BoundShaderState.VertexShaderRHI = GETSAFERHISHADER_VERTEX(*VertexShader);
GraphicsPSOInit.BoundShaderState.PixelShaderRHI = GETSAFERHISHADER_PIXEL(*PixelShader);
GraphicsPSOInit.PrimitiveType = PT_TriangleList;
SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
Copying //UE4/Dev-Rendering to //UE4/Dev-Main (Source: //UE4/Dev-Rendering @ 3274304) #lockdown Nick.Penwarden #rb none ========================== MAJOR FEATURES + CHANGES ========================== Change 3250856 on 2017/01/09 by Daniel.Wright Only showing instruction count for 'Base pass shader' now Change 3250943 on 2017/01/09 by Rolando.Caloca DR - Async Compute PSO creation Change 3251036 on 2017/01/09 by Rolando.Caloca DR - Add r.AsyncPipelineCompile - Dispatch on any thread - Wait for completion event Change 3251058 on 2017/01/09 by Ben.Woodhouse Fix for PSO creation D3D error with NumRenderTargets. Add code to compute the correct number of valid rendertargets to prevent an issue during PSO creation when NumRenderTargets is >0, but none of the formats are valid (all formats are DXGI_UNKNOWN) #jira UE-40332 Change 3251141 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite CL 3243458: D3D12 memory optimization - The d3d12 buddy suballocator is very wasteful for allocations above 4KB, but the vast majority of allocations are smaller . In the default buffer allocator this was causing 149MB of waste in 340MB of allocations. Moving the max allocation size threshold down to 4KB from 512KB saved 100MB of memory wastage memory. On PC, buffers are 64KB aligned, so we need the threshold to be higher to avoid additional wastage. Add PIX memory tracking instrumentation for buddy allocators so we can track the memory properly in PIX Change 3251142 on 2017/01/09 by Ben.Woodhouse Duplicated from Fortnite 3243496 memory optimisation: use NULL-terminated ansi strings instead of unicode FStrings for symbols, saving 118MB. Previously the strings were loaded from disk as ansi and then converted to FStrings (slowly), before finally being converted them back to ansi strings before being used. In addition to reducing memory overhead, this change reduces complexity and improves startup time. Change 3252323 on 2017/01/10 by Rolando.Caloca DR - Gfx async PSO creation prep Change 3252474 on 2017/01/10 by Daniel.Wright Added 'Compile Unreal Lightmass' to error message Change 3252589 on 2017/01/10 by Daniel.Wright Back out bulk data for distance fields from cl 3241990 which causes distance fields to be corrupt in Fortnite Change 3252790 on 2017/01/10 by Daniel.Wright Added InscatteringColorCubemapAngle to exponential height fog Change 3252843 on 2017/01/10 by Uriel.Doyon Propper fix for UE-40211, where texture streaming bound defrag and async tasks could interact in coherent ways. The bound defrag is now done outside of the async work logic. Change 3252866 on 2017/01/10 by Mark.Satterthwaite Fix Metal shader pipeline hash collisions caused by deferring MTLFunction construction until PrepareToDraw so that we may use Function-Constants to specialise the shader source without generating additional permutations. This is required to generate proper tessellation shaders which are specialised against the index-buffer usage & type (none, uint16, uint32). While we're here amend the hash functions to make better use of the existing hash functions to improve the distribution and hopefully reduce the possibility of collisions in future. #jira UE-40357 Change 3254511 on 2017/01/11 by Rolando.Caloca DR - PSO stats Change 3255958 on 2017/01/12 by Mark.Satterthwaite Reimplement RQT_AbsoluteTime for Metal - pretty sure I did this before, but somehow it got lost. When a RQT_AbsoluteTime is inserted into the command-stream, insert a command-buffer completion handler to record the time of completion & submit the command-buffer immediately. This breaks command-buffers so is noticeably slower and if inserted in a pass that can't be restarted will fail but is currently the only option available. This is sufficient to support the GPUBenchmark used by Scalability. To make this more efficient I've refactored the FMetalCommandBufferFence implementation so that we use a single shared-ptr object containing the command-buffer and a dispatch semaphore, rather than allocating one for each query. The semaphore allows for timed-waits where previously we'd block until completion, unlike the other APIs that report failure after a fixed interval (2s for RQT_AbsoluteTime, otherwise 0.5s). Sadly not all drivers support this abuse of the Metal API, so replace the GL-based workaround for not having time queries with one that just guesses based on RHI device details. Radars will be filed. #jira UE-40554 Change 3256329 on 2017/01/12 by Olaf.Piesche #jira UE-38615 Assert shouldn't be necessary; in fact, it causes a crash when exporting emitters, since in that case we're changing the template at runtime. Change 3256371 on 2017/01/12 by Uriel.Doyon Reenabled texture streaming bound defrag as the fix is in CL 3252843 Change 3257032 on 2017/01/13 by Daniel.Wright Added fastClamp to fastmath.usf Change 3257111 on 2017/01/13 by Daniel.Wright Disabled bAffectDistanceFieldLighting on DefaultPawn, fixes VisualizeMeshDistanceFields in game Change 3257112 on 2017/01/13 by Daniel.Wright DFAO optimizations * Changed the culling algorithm to produce a list of intersecting screen tiles for each object, instead of the other way around. Each tile / object intersection gets its own cone tracing thread group so wavefronts are much smaller and scheduled better. 3.63ms -> 3.48ms (.15ms) * Replace slow instructions in inner loop with fast approximations (exp2 -> sqr + 1, rcpFast, lengthFast) 3.25ms -> 3.09ms (.16ms) * Moved transform from world to local space out of the inner loop (sample position constructed from local space position + direction) 3.09ms -> 3.04ms * Compute shader for ClearUAV 3.04ms -> 2.62ms (.42ms) Change 3257113 on 2017/01/13 by Daniel.Wright Better distance field memory stats Change 3257326 on 2017/01/13 by Uriel.Doyon Workaround to support cases where several textures have the same lighting GUID. Change 3257448 on 2017/01/13 by Daniel.Wright Removed legacy features Distance Field Specular Occlusion, Distance Field Surface Cache AO, PreCullTriangles Change 3257616 on 2017/01/13 by Daniel.Wright Distance field mesh visualization now uses a cone containing the entire tile to cull objects with, making the results stable Change 3257657 on 2017/01/13 by Daniel.Wright Mesh distance fields are stored zlib compressed in memory until needed for uploading to GPU * 81Mb of backing memory -> 32Mb in GPUPerfTest, atlas upload time 29ms -> 893ms Change 3258063 on 2017/01/14 by Rolando.Caloca DR - vk - Refactor descriptor set reuse in prep for more changes Change 3258715 on 2017/01/16 by Daniel.Wright Added VisualizeGlobalDistanceField show flag Change 3258827 on 2017/01/16 by Daniel.Wright Global distance field update regions are clipped against others to reduce redundant updates. Change 3258959 on 2017/01/16 by Benjamin.Hyder Updating Planar Reflection example material in TM-Shadermodels Change 3259270 on 2017/01/16 by Daniel.Wright [Copy] 'r.MSAACount 1' now produces no MSAA or TAA. 'r.MSAACount 0' can be used to toggle TAA on for comparisons. Change 3259652 on 2017/01/16 by Uriel.Doyon Better support for static primitive becoming dynamic. Change 3260107 on 2017/01/17 by Ben.Woodhouse Fix FMonitoredProcess to prevent infinite loop in -nothreading mode #jira UE-40717 Change 3260594 on 2017/01/17 by Daniel.Wright Added a new global distance field (4x 128^3 clipmaps) which caches mostly static primitives (Mobility set to Static or Stationary) * The full global distance field inherits from the mostly static cache, so when a Movable primitive is modified, only other movable primitives in the vicinity need to be re-composited into the global distance field * Global distance field update cost with one large rotating object went from 2.5ms -> .2ms on 970GTX and 4.6ms -> .3ms. Worst case full volume update is mostly the same. * Adds 12Mb for the new volume textures Change 3260956 on 2017/01/17 by Daniel.Wright Structured buffers for DF object data * Full global distance field clipmap composite 3.0ms -> 2.0ms due to scalarized loads Change 3261296 on 2017/01/17 by Daniel.Wright Exposed MaxObjectsPerTile with 'r.AOMaxObjectsPerCullTile' and lowered the default from 512 to 256, saves 17Mb of object tile culling data structures Removed unnecessary UAV transitions preventing object and global cone tracing from overlapping, saves ~.1ms Change 3262036 on 2017/01/18 by Ben.Salem V0 of Perf monitor plugin for easily consumable stat csvs. With plugin enabled, enter PerformanceMonitor help into the console to get usage details. Change 3262056 on 2017/01/18 by Chris.Bunner Remove inverse tonemapping when rendering HDR output. #jira UE-40728 Change 3262661 on 2017/01/18 by Rolando.Caloca DR - Add missing SetStencilRef() and SetBlendFactor() on most RHIs - Fix hash for PSOs Change 3263674 on 2017/01/19 by Chris.Bunner PR #3144: Improved error messages (Contributed by DarkSlot) #jira UE-40835 Change 3264150 on 2017/01/19 by Ben.Woodhouse Add support for single threaded in FMonitoredProcess. Deprecated IsRunning() in favour of a new Update() method because polling IsRunning is not compatible with -nothreading mode #jira UE-40841 Change 3264153 on 2017/01/19 by Ben.Woodhouse Integrate latest changes from MS-DX12 CLs 3231395-3262526 - Added WinPixEventRuntime.tps - Includes PIX support, various optimizations (saved 1.3ms in testbed scene) CL 3262343: Fix depth testing on translucency not working correctly after cl 3231395. This change reapplies the D3D12RHI changes from CL 3231395 because those changes were lost when integrating from //Dev-Rendering/ but also includes the depth fixes: - Fix depth state not being in DEPTH_READ for use as depth read. The issue was HasDepthBits and HasStencilBits wern't intended for SRV formats and always returned false in the SRV case. CL 3231395: Update D3D12 RHI: - Fix deferred MSAA path in RHI - Add Pix3.h support - Cleanup SetName usage and remove it from shipping builds. - Fix fence reuse bug. We were signaling MAX UINT (-1) and then waiting for 0, which was always signaled. This change also removes the fence value reset code, there is no need to reset a fence to a previous value. - Use FPlatformAtomics::InterlockedIncrement instead of InterlockedIncrement64 - Use InterlockedIncrement() instead of _InterlockedIncrement() and use the FPlatformAtomics:: version. - Fix possible readback heap being evicted while in use. GetQueryData happens on the render thread and isn't tied to a command list so we should always have readback heaps resident. Change 3264251 on 2017/01/19 by Mark.Satterthwaite Modify some asserts in MetalRHI - technically using a store-action of ENoAction on Stencil buffers should make it invalid to restart a render-pass but on Mac it will work because ENoAction won't invalidate anything written. In future we need to use deferred store-actions in Metal so that we can "restart" passes while enforcing correct Load/Store actions. #jira UE-40803 Change 3264642 on 2017/01/19 by Daniel.Wright Raised GMaxShadowDepthBufferSizeX to max texture resolution on most platforms, was previously 4096. Change 3265330 on 2017/01/20 by Ben.Salem Stop performance plugin from building in Win32. #tests recompiled and preflighted Change 3265678 on 2017/01/20 by Marcus.Wassmer Fix bad declaration. #3055 Change 3266656 on 2017/01/20 by Mark.Satterthwaite Changes to the FShaderCache to restore it and extend it to optionally report on shader de-duplication when generating a binary shader cache (Console Variable: r.BinaryShaderCacheLogging). Duplicate & amend CL #3266053 from Trepka: Fixed issues with shader cache not working properly with Mac Metal (but it still requires -norhithread to work at all). Enabled the shader cache by default if RHI thread is disabled. Amend & integrate RCO's CL #3197085. Change 3267741 on 2017/01/23 by Rolando.Caloca DR - Detect duplicated shader and pipeline types Change 3268600 on 2017/01/23 by Uriel.Doyon Added missing r.Streaming.MaxEffectiveScreenSize config to base texture scability settings. Integrated CL 3227368 from Orion stream Enabled r.Streaming.UsePerTextureBias by default as this has been tested in Orion for several months. Fixed issue with the InvestigateTexture command which could return invalid reference depending on the timing, Added th MaxEffectiveScreenSize settings in the investigate texture command. Change 3269512 on 2017/01/24 by Richard.Wallis Fix for shader binary cache uncompress data size during internal shader log. Change 3271237 on 2017/01/25 by Ben.Woodhouse D3D12 updateTexture2D crash fix #jira UE-41059 Change 3271564 on 2017/01/25 by Olaf.Piesche #jira UE-40980 #udn 325525 Fix uniform buffers for mesh particles; these should really be on the mesh collector, so allocating them as a one frame resource is safe Change 3271594 on 2017/01/25 by Ben.Woodhouse ESRAM support stage 1: Implemented noncontiguous ESRAM page allocator replacing XgMemoryLayout API. The allocator allocates non-contiguous ranges of pages and maps them onto a contiguous virtual address range. Unlike the previous implementation, this allocator frees pages for reuse when resources are destroyed Note: issues with deferred deallocation may prevent reuse in many cases - that will be addressed in the next stage Support for the old allocator is still available (for now) via the define NEW_ESRAM_ALLOCATOR #fyi rolando.caloca Change 3272616 on 2017/01/25 by Rolando.Caloca DR - Update shader version Change 3273138 on 2017/01/26 by Ben.Woodhouse Fix merge issue with MonitoredProcess.cpp (this arose from an integration made as an edit in dev-rendering, which confused perforce when the change was subsequently integrated from main) [CL 3274498 by Rolando Caloca in Main branch]
2017-01-26 19:20:49 -05:00
PixelShader->SetParameters(RHICmdList, ViewInfo, VisualizeResultRT);
DrawRectangle(
RHICmdList,
0, 0,
ViewInfo.ViewRect.Width(), ViewInfo.ViewRect.Height(),
ViewInfo.ViewRect.Min.X / GAODownsampleFactor, ViewInfo.ViewRect.Min.Y / GAODownsampleFactor,
ViewInfo.ViewRect.Width() / GAODownsampleFactor, ViewInfo.ViewRect.Height() / GAODownsampleFactor,
FIntPoint(ViewInfo.ViewRect.Width(), ViewInfo.ViewRect.Height()),
GetBufferSizeForAO(),
*VertexShader);
}
}
}
}
}