UnrealEngineUWP/Engine/Source/Runtime/Renderer/Private/PostProcess/PostProcessHistogram.cpp
Nick Penwarden 6d5e1da95f Copying //UE4/Dev-Rendering to Dev-Main (//UE4/Dev-Main)
#lockdown ben.marsh

==========================
MAJOR FEATURES + CHANGES
==========================

Change 2774277 on 2015/11/19 by Gil.Gribb

	UE4 - Did minor optimizations to the PS4 RHI and drawlists.

Change 2791226 on 2015/12/04 by Uriel.Doyon

	Added source code for Embree 2.7.0
	Removed duplicate files from the /doc folder.

Change 2800193 on 2015/12/11 by Marcus.Wassmer

	SSAO AsyncCompute support.
	#rb Martin.Mittring

Change 2801631 on 2015/12/14 by Olaf.Piesche

	Making auto deactivate true by default, moving checks to HasCompleted, eliminating some unnecessary logic

	#rb martin.mittring

Change 2803240 on 2015/12/15 by Gil.Gribb

	UE4 - Added command to collect stats on spammy stats.

Change 2803476 on 2015/12/15 by Rolando.Caloca

	DR - Allow toggling compute skin dispatch at runtime
	- r.SkinCacheShaders Now enable the shaders and feature
	- r.SkinCaching enables toggling at runtime
	- r.SkinCache.BufferSize Sets the size in bytes of buffer for outputting
	- Now uses 3 UAV buffers instead of one (avoid RenderDoc crashes)
	#codereview Marcus.Wassmer, Martin.Mittring

Change 2803940 on 2015/12/15 by Marcus.Wassmer

	Add r.PS4.AsyncComputeBudgetMode to switch between CUMasking and WaveLimit modes.  So far it looks like WaveLimits behave better in UE4.

	Also rearrange AsyncSSAO to run immediately after HZB to overlap with occlusion queries.  In my testing this takes SSAO cost from .5ms -> .2ms.   However it had to be hacked to run without normals.  Hopefully Martin can get some real AsyncSSAO in.

	#rb Martin.Mittring
	#codereview Martin.Mittring

Change 2803999 on 2015/12/15 by Uriel.Doyon

	Refactored the shader complexity material override logic to allow other viewmodes shader overrides.
	TexelFactorAccuracy ViewMode : shows the accuracy of the static mesh texel factors, used for streaming.
	WantedMipsAccuracy ViewMode : shows the accuracy of the static mesh wanted mips, used for streaming.
	Added an option to stream textures based on the AABB distance instead of using the sphere approximation.
	Added an option to only keep the wanted mips.
	Moved optimization related viewmodes into a submenu to avoid polluting the interface.
	#jira UE-24502
	#jira UE-24503
	#jira UERNDR-89

Change 2804150 on 2015/12/15 by Olaf.Piesche

	make separate translucency screen percentage a bit more robust; add numsamples to the render target creation functions in preparation for MSAA support for higher quality with low res separate translucency

	#rb martin.mittring

Change 2804367 on 2015/12/15 by Daniel.Wright

	Capsule shadow primitives are tracked separately on registration - saves 2.6ms of RT time doing the view frustum culling in a medium sized map

Change 2805293 on 2015/12/16 by Olaf.Piesche

	logging if potentially immortal emitters are spawned from gameplay; this should catch if we spawn burst only emitters with indefinite life spans (muzzle flashes, hit impacts, etc.)

	#rb martin.mittring

Change 2805586 on 2015/12/16 by Zabir.Hoque

	Adding support for decals to fade and destroy themselves automatically.

	#CodeReview: Martin.Mittring, Daniel.Wright, Olaf.Piesche

Change 2807663 on 2015/12/17 by Rolando.Caloca

	DR - Remove expensive logging
	#codereview Marcus.Wassmer

Change 2807903 on 2015/12/17 by Zabir.Hoque

	Refactored DecalComponent's lifetime management such that it can be set and reset from Blueprints.

	#CodeReview Daniel.Wright, Martin.Mittring, Olaf.Piesche

Change 2809261 on 2015/12/18 by Martin.Mittring

	Added VisualizeShadingModels to track down issues like this:
	  FORT-16913 Textures on Hero Mesh is not shown
	#rb:David.Hill
	#code_review:Bob.Tellez

Change 2810136 on 2015/12/21 by Rolando.Caloca

	DR - Added back draw event colors
	PR #1602
	#jira UE-21526
	#codereview Mark.Satterthwaite, Keith.Judge, Marcus.Wassmer, Josh.Adams

Change 2810680 on 2015/12/21 by Martin.Mittring

	moved SSAO ComputeShader running without per pixel normal (for AsyncCompute) into DevRendering
	#test:editor

Change 2811205 on 2015/12/22 by Brian.Karis

	Pulled clear coat out of the reflection compute shader. Added permutation for skylight.

	Clear coat base layer now done in base pass. It only picks up the closest capture. This will cause popping when the object moves. Still needs a cross fade.

Change 2811275 on 2015/12/22 by David.Hill

	UE-24675
	#rb martin.mittring

	Corrected buffer-size related problem with fringe.

Change 2811397 on 2015/12/22 by Brian.Karis
2016-01-08 11:12:28 -05:00

// Copyright 1998-2016 Epic Games, Inc. All Rights Reserved.

/*=============================================================================
	PostProcessHistogram.cpp: Post processing histogram implementation.
=============================================================================*/

#include "RendererPrivate.h"
#include "ScenePrivate.h"
#include "PostProcessHistogram.h"
#include "PostProcessing.h"
#include "PostProcessEyeAdaptation.h"
#include "SceneUtils.h"

/** Encapsulates the post processing histogram compute shader. */
class FPostProcessHistogramCS : public FGlobalShader
{
	DECLARE_SHADER_TYPE(FPostProcessHistogramCS, Global);

	static bool ShouldCache(EShaderPlatform Platform)
	{
		return IsFeatureLevelSupported(Platform, ERHIFeatureLevel::SM5);
	}

	static void ModifyCompilationEnvironment(EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment)
	{
		FGlobalShader::ModifyCompilationEnvironment(Platform, OutEnvironment);
		OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZEX"), FRCPassPostProcessHistogram::ThreadGroupSizeX);
		OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZEY"), FRCPassPostProcessHistogram::ThreadGroupSizeY);
		OutEnvironment.SetDefine(TEXT("LOOP_SIZEX"), FRCPassPostProcessHistogram::LoopCountX);
		OutEnvironment.SetDefine(TEXT("LOOP_SIZEY"), FRCPassPostProcessHistogram::LoopCountY);
		OutEnvironment.SetDefine(TEXT("HISTOGRAM_SIZE"), FRCPassPostProcessHistogram::HistogramSize);
		OutEnvironment.CompilerFlags.Add( CFLAG_StandardOptimization );
	}

	/** Default constructor. */
	FPostProcessHistogramCS() {}

public:
	FPostProcessPassParameters PostprocessParameter;
	FShaderResourceParameter HistogramRWTexture;
	FShaderParameter HistogramParameters;
	FShaderParameter ThreadGroupCount;
	FShaderParameter LeftTopOffset;
	FShaderParameter EyeAdaptationParams;

	/** Initialization constructor. */
	FPostProcessHistogramCS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
		: FGlobalShader(Initializer)
	{
		PostprocessParameter.Bind(Initializer.ParameterMap);
		HistogramRWTexture.Bind(Initializer.ParameterMap, TEXT("HistogramRWTexture"));
		HistogramParameters.Bind(Initializer.ParameterMap, TEXT("HistogramParameters"));
		ThreadGroupCount.Bind(Initializer.ParameterMap, TEXT("ThreadGroupCount"));
		LeftTopOffset.Bind(Initializer.ParameterMap, TEXT("LeftTopOffset"));
		EyeAdaptationParams.Bind(Initializer.ParameterMap, TEXT("EyeAdaptationParams"));
	}

	void SetCS(FRHICommandList& RHICmdList, const FRenderingCompositePassContext& Context, FIntPoint ThreadGroupCountValue, FIntPoint LeftTopOffsetValue, FIntPoint GatherExtent)
	{
		const FComputeShaderRHIParamRef ShaderRHI = GetComputeShader();

		FGlobalShader::SetParameters(RHICmdList, ShaderRHI, Context.View);
		PostprocessParameter.SetCS(ShaderRHI, Context, Context.RHICmdList, TStaticSamplerState<SF_Point,AM_Clamp,AM_Clamp,AM_Clamp>::GetRHI());

		SetShaderValue(RHICmdList, ShaderRHI, ThreadGroupCount, ThreadGroupCountValue);
		SetShaderValue(RHICmdList, ShaderRHI, LeftTopOffset, LeftTopOffsetValue);

		FVector4 HistogramParametersValue(GatherExtent.X, GatherExtent.Y, 0, 0);
		SetShaderValue(RHICmdList, ShaderRHI, HistogramParameters, HistogramParametersValue);

		{
			FVector4 Temp[3];

			FRCPassPostProcessEyeAdaptation::ComputeEyeAdaptationParamsValue(Context.View, Temp);
			SetShaderValueArray(RHICmdList, ShaderRHI, EyeAdaptationParams, Temp, 3);
		}
	}

	// FShader interface.
	virtual bool Serialize(FArchive& Ar) override
	{
		bool bShaderHasOutdatedParameters = FGlobalShader::Serialize(Ar);
		Ar << PostprocessParameter << HistogramRWTexture << HistogramParameters << ThreadGroupCount << LeftTopOffset << EyeAdaptationParams;
		return bShaderHasOutdatedParameters;
	}
};

IMPLEMENT_SHADER_TYPE(,FPostProcessHistogramCS,TEXT("PostProcessHistogram"),TEXT("MainCS"),SF_Compute);

void FRCPassPostProcessHistogram::Process(FRenderingCompositePassContext& Context)
{
	SCOPED_DRAW_EVENT(Context.RHICmdList, PostProcessHistogram);

	const FPooledRenderTargetDesc* InputDesc = GetInputDesc(ePId_Input0);

	if(!InputDesc)
	{
		// input is not hooked up correctly
		return;
	}

	const FSceneView& View = Context.View;
	const FSceneViewFamily& ViewFamily = *(View.Family);

	FIntPoint SrcSize = InputDesc->Extent;
	FIntRect DestRect = View.ViewRect;

	const FSceneRenderTargetItem& DestRenderTarget = PassOutputs[0].RequestSurface(Context);

	TShaderMapRef<FPostProcessHistogramCS> ComputeShader(Context.GetShaderMap());

	SetRenderTarget(Context.RHICmdList, FTextureRHIRef(), FTextureRHIRef());
	Context.RHICmdList.SetComputeShader(ComputeShader->GetComputeShader());

	// set destination
	check(DestRenderTarget.UAV);
	Context.RHICmdList.TransitionResource(EResourceTransitionAccess::ERWBarrier, EResourceTransitionPipeline::EGfxToCompute, DestRenderTarget.UAV);
	Context.RHICmdList.SetUAVParameter(ComputeShader->GetComputeShader(), ComputeShader->HistogramRWTexture.GetBaseIndex(), DestRenderTarget.UAV);

	FIntPoint GatherExtent = ComputeGatherExtent(View);
	FIntPoint ThreadGroupCountValue = ComputeThreadGroupCount(GatherExtent);

	ComputeShader->SetCS(Context.RHICmdList, Context, ThreadGroupCountValue, (DestRect.Min + FIntPoint(1, 1)) / 2, GatherExtent);

	DispatchComputeShader(Context.RHICmdList, *ComputeShader, ThreadGroupCountValue.X, ThreadGroupCountValue.Y, 1);

	// un-set destination
	Context.RHICmdList.SetUAVParameter(ComputeShader->GetComputeShader(), ComputeShader->HistogramRWTexture.GetBaseIndex(), NULL);

	Context.RHICmdList.TransitionResource(EResourceTransitionAccess::EReadable, EResourceTransitionPipeline::EComputeToGfx, DestRenderTarget.UAV);
	ensureMsgf(DestRenderTarget.TargetableTexture == DestRenderTarget.ShaderResourceTexture, TEXT("%s should be resolved to a separate SRV"), *DestRenderTarget.TargetableTexture->GetName().ToString());
}

FIntPoint FRCPassPostProcessHistogram::ComputeGatherExtent(const FSceneView& View)
{
	// The input is currently assumed to be half resolution. One full-resolution pixel is
	// subtracted before halving so the gather avoids bilinear-filtered input.
	return (View.ViewRect.Size() - FIntPoint(1, 1)) / 2;
}
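
/*
Worked example (a standalone sketch, not engine code): the half-resolution
arithmetic used by ComputeGatherExtent() and by the LeftTopOffset argument that
Process() passes to SetCS(). Int2 and Rect2 are hypothetical stand-ins for
FIntPoint and FIntRect, and the sketch assumes non-negative view rectangle
coordinates, so integer division truncates toward zero.

```cpp
#include <cstdint>

struct Int2 { int32_t X; int32_t Y; };   // hypothetical stand-in for FIntPoint
struct Rect2 { Int2 Min; Int2 Max; };    // hypothetical stand-in for FIntRect

// Gather extent: (Size - 1) / 2 per axis, where Size = Max - Min.
Int2 GatherExtentSketch(const Rect2& ViewRect)
{
	return Int2{ (ViewRect.Max.X - ViewRect.Min.X - 1) / 2,
	             (ViewRect.Max.Y - ViewRect.Min.Y - 1) / 2 };
}

// Top-left offset in half-resolution texels: (Min + 1) / 2 per axis.
Int2 LeftTopOffsetSketch(const Rect2& ViewRect)
{
	return Int2{ (ViewRect.Min.X + 1) / 2, (ViewRect.Min.Y + 1) / 2 };
}
```

For a 1920x1080 view rectangle at the origin this gives a gather extent of
959x539 and an offset of (0, 0).
*/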

FIntPoint FRCPassPostProcessHistogram::ComputeThreadGroupCount(FIntPoint PixelExtent)
{
	// Each thread group covers ThreadGroupSize * LoopCount texels per axis.
	uint32 TexelPerThreadGroupX = ThreadGroupSizeX * LoopCountX;
	uint32 TexelPerThreadGroupY = ThreadGroupSizeY * LoopCountY;

	// Round up so that a partially covered group at the right/bottom edge is still dispatched.
	uint32 ThreadGroupCountX = (PixelExtent.X + TexelPerThreadGroupX - 1) / TexelPerThreadGroupX;
	uint32 ThreadGroupCountY = (PixelExtent.Y + TexelPerThreadGroupY - 1) / TexelPerThreadGroupY;

	return FIntPoint(ThreadGroupCountX, ThreadGroupCountY);
}
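
/*
Worked example (a standalone sketch, not engine code): the round-up division in
ComputeThreadGroupCount(). The constants below are assumptions chosen for
illustration; the real ThreadGroupSizeX/Y and LoopCountX/Y are declared in
PostProcessHistogram.h, which is not shown here. Int2 is a hypothetical stand-in
for FIntPoint.

```cpp
#include <cstdint>

// Assumed values, for illustration only.
constexpr uint32_t kThreadGroupSizeX = 8;
constexpr uint32_t kThreadGroupSizeY = 4;
constexpr uint32_t kLoopCountX = 8;
constexpr uint32_t kLoopCountY = 8;

struct Int2 { int32_t X; int32_t Y; };   // hypothetical stand-in for FIntPoint

// Smallest count such that Count * Divisor >= Value.
constexpr uint32_t DivideRoundUp(uint32_t Value, uint32_t Divisor)
{
	return (Value + Divisor - 1) / Divisor;
}

Int2 ThreadGroupCountSketch(Int2 PixelExtent)
{
	const uint32_t TexelsPerGroupX = kThreadGroupSizeX * kLoopCountX;   // 64 with the assumed values
	const uint32_t TexelsPerGroupY = kThreadGroupSizeY * kLoopCountY;   // 32 with the assumed values
	return Int2{
		static_cast<int32_t>(DivideRoundUp(static_cast<uint32_t>(PixelExtent.X), TexelsPerGroupX)),
		static_cast<int32_t>(DivideRoundUp(static_cast<uint32_t>(PixelExtent.Y), TexelsPerGroupY)) };
}
```

With these assumed constants, a 959x539 gather extent needs 15x17 thread
groups: 15 * 64 = 960 >= 959 and 17 * 32 = 544 >= 539.
*/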

FPooledRenderTargetDesc FRCPassPostProcessHistogram::ComputeOutputDesc(EPassOutputId InPassOutputId) const
{
	FPooledRenderTargetDesc UnmodifiedRet = GetInput(ePId_Input0)->GetOutput()->RenderTargetDesc;

	UnmodifiedRet.Reset();

	FIntPoint PixelExtent = UnmodifiedRet.Extent;

	FIntPoint ThreadGroupCount = ComputeThreadGroupCount(PixelExtent);

	// each ThreadGroup outputs one histogram
	FIntPoint NewSize = FIntPoint(HistogramTexelCount, ThreadGroupCount.X * ThreadGroupCount.Y);

	// format can be optimized later
	FPooledRenderTargetDesc Ret(FPooledRenderTargetDesc::Create2DDesc(NewSize, PF_FloatRGBA, FClearValueBinding::None, TexCreate_None, TexCreate_RenderTargetable | TexCreate_UAV, false));
	Ret.DebugName = TEXT("Histogram");
	return Ret;
}
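
/*
Worked example (a standalone sketch, not engine code): how the output texture
size in ComputeOutputDesc() follows from the input extent. Each thread group
writes one histogram row, so the height is the total group count and the width
is HistogramTexelCount. All constants below are assumptions for illustration;
in particular, kHistogramTexelCount = kHistogramSize / 4 assumes four buckets
packed into each RGBA texel. The real values are declared in
PostProcessHistogram.h, which is not shown here.

```cpp
#include <cstdint>

// Assumed values, for illustration only.
constexpr uint32_t kThreadGroupSizeX = 8;
constexpr uint32_t kThreadGroupSizeY = 4;
constexpr uint32_t kLoopCountX = 8;
constexpr uint32_t kLoopCountY = 8;
constexpr uint32_t kHistogramSize = 64;
constexpr uint32_t kHistogramTexelCount = kHistogramSize / 4;   // assumption: 4 buckets per RGBA texel

struct Extent2 { uint32_t Width; uint32_t Height; };   // hypothetical helper type

// Smallest count such that Count * Divisor >= Value.
constexpr uint32_t DivideRoundUp(uint32_t Value, uint32_t Divisor)
{
	return (Value + Divisor - 1) / Divisor;
}

// One row per thread group; each row is kHistogramTexelCount texels wide.
Extent2 HistogramOutputExtentSketch(Extent2 InputExtent)
{
	const uint32_t GroupsX = DivideRoundUp(InputExtent.Width, kThreadGroupSizeX * kLoopCountX);
	const uint32_t GroupsY = DivideRoundUp(InputExtent.Height, kThreadGroupSizeY * kLoopCountY);
	return Extent2{ kHistogramTexelCount, GroupsX * GroupsY };
}
```

With these assumed constants, a 960x540 input yields 15 * 17 = 255 thread
groups and therefore a 16x255 histogram texture.
*/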