Files
UnrealEngineUWP/Engine/Documentation/Source/Programming/Rendering/ShaderDevelopment/AsyncCompute/AsyncCompute.INT.udn
Ben Marsh 40fd45b376 Copying //UE4/Dev-Documentation to Samples-Main (//UE4/Samples-Main)
#rb none
#lockdown Nick.Penwarden

[CL 3239839 by Ben Marsh in Main branch]
2016-12-19 13:09:22 -05:00

60 lines
3.7 KiB
Plaintext

Availability:Public
Title:AsyncCompute
Crumbs:%ROOT%, Programming, Programming/Rendering, Programming/Rendering/ShaderDevelopment
Description:This is about AsyncCompute, a hardware feature that allows to interleave different GPU work to improve efficiency.
Version: 4.9
tags:Rendering
The Rendering Hardware Interface (RHI) now supports asynchronous compute (**AsyncCompute**) for Xbox One. This is a good way to utilize unused GPU resources (Compute Units (CUs), registers and bandwidth),
by running **dispatch()** calls asynchronously with the rendering. Async compute uses a separate context, and we provide RHI functions to
synchronize the rendering and compute contexts.
Dr PIX is useful for identifying areas which might benefit from async compute. For example, if half the CUs are unused during a particular
rendering pass, those CUs could potentially be utilized by an asynchronous compute job.
Async compute has some restrictions:
* Buffers with UAV counters are not supported (this is a limitation of the XDK, and will generate a D3D warning)
* Async compute jobs do not show up in PIX GPU captures (although they do appear in Timing Captures)
PIX only captures the graphics immediate context (this is a platform limitation).
* Automatic pipeline flushes are not provided by the driver. You need to explicitly call RHICSManualGpuFlush if flushes are needed
(e.g. if one dispatch is dependent on the previous one)
## API
* **RHIBeginAsyncComputeJob_DrawThread** (EAsyncComputePriority Priority)
Begins an async compute job from the rendering thread. The priority is used to determine the number of shader resources to allocate
for the job (via ID3D11XComputeContext::SetLimits). There are currently two priorities available, AsyncComputePriority_High and
AsyncComputePriority_Default.
* **RHIEndAsyncComputeJob_DrawThread**
Ends an async compute job. Returns a 32-bit Fence Index which can be used for synchronization, or -1 if compute is disabled or
there is no active compute job
* **RHIGraphicsWaitOnAsyncComputeJob**
Inserts an command in the graphics immediate context to block until an async job is complete. Passing -1 causes this to early-out.
Between calls to RHIBeginAsyncComputeJob_DrawThread and RHIEndAsyncComputeJob_DrawThread, the RHI will be in the async compute state.
During this time, supported RHI commands will be executed via the async compute context. Unsupported RHI functions will assert.
## Disabling/Enabling
Async compute can be enabled or disabled at compile time with the #define USE_ASYNC_COMPUTE_CONTEXT. It can be disabled at runtime
with the r.AsyncCompute console variable. When async compute is disabled, dispatches within async compute blocks are executed synchronously on
the graphics command buffer. USE_ASYNC_COMPUTE_CONTEXT is defined to 0 on PC, since it's not supported in D3D11.1.
## PIX
Async compute context jobs are not captured in GPU captures, so these captures can give a misleading picture of GPU performance when
async compute is enabled. For graphics debugging purposes, async compute should be disabled using the above cvar.
Async compute is supported by PIX timing captures. These show up in the timeline like this:
![](PixTimingCapture.png)
## Thanks and Future
This feature was implemented by Lionhead Studios. We integrated it and indend to make use of it as a tool to optimize the XboxOne rendering.
As more more APIs expose the hardware feature we would like make the system more cross platform. Features that make use use AsyncCompute you always
be able to run without (console variable / define) to run on other platforms and easier debugging and profiling. AsyncCompute should be used with caution
as it can cause more unpredicatble performance and requires more coding effort for synchromization.