The aim of this work is to improve overall CPU cost in 3 ways:
- to reduce the per-frame overhead of running ECS tasks
- to decrease the latency before useful gamethread work is able to run
- to increase thread utilization where many cores are available
In both single-threaded and multi-threaded modes this shows a wall-clock speed improvement of ~25%, upon which other optimizations can now be made.
Previously, the Dispatch_ methods would construct tasks and filters for their work every frame. Tasks dispatched by downstream systems and added as implicit dependencies would always depend on all the work of their upstream systems, even when they didn't need to (for example, float properties would have to wait for *all* float channels to be evaluated and blended before they could be applied, even channels that only applied to transform properties).
The new approach introduces a new system phase (ESystemPhase::Scheduling) which is only run as part of post-instantiation. This phase invokes a new method (OnSchedulePersistentTasks) which allows systems to define the task structure for the current state of the ECS. Tasks defined here are stored inside the scheduler and live for the duration of the instantiation - if the structure of the entity manager materially changes, the task graph is reconstructed.
Entity task builders are still used in this new method, but can only call Schedule_ and Fork_ functions for running tasks. These methods define serial and parallel behavior respectively that is to be run across the ECS (as should already be familiar). Crucially, each relevant entity allocation now gets its own task - the filter is run when the task graph is constructed, and a new task is created for each match. This allows us to interleave and/or distribute tasks across as many cores as possible, as soon as prerequisites are completed.
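To illustrate the per-allocation approach, here is a minimal sketch (with hypothetical types - not the real UE API) of running a component filter once at graph-construction time and emitting one task node per matching allocation, so matches can later be distributed across worker threads:

```cpp
// Hypothetical sketch: one task per matching entity allocation, created at
// graph-construction time rather than per frame.
#include <cassert>
#include <cstdint>
#include <vector>

struct Allocation { uint64_t ComponentMask = 0; int Num = 0; };

struct TaskNode
{
    int AllocationIndex;            // which allocation this task processes
    std::vector<int> Prerequisites; // tasks that must finish first
};

// The filter runs once, when the graph is constructed; each match becomes
// its own task instead of a single task iterating every allocation.
std::vector<TaskNode> BuildForkedTasks(const std::vector<Allocation>& Allocations,
                                       uint64_t AllOfMask)
{
    std::vector<TaskNode> Tasks;
    for (int Index = 0; Index < (int)Allocations.size(); ++Index)
    {
        if ((Allocations[Index].ComponentMask & AllOfMask) == AllOfMask)
        {
            Tasks.push_back(TaskNode{ Index, {} });
        }
    }
    return Tasks;
}
```

Because each task owns exactly one allocation, the scheduler can start any of them the moment its prerequisites are done, rather than waiting for a monolithic dispatch to reach that allocation.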
Tasks may also 'prelock' their component data - doing so will cache the location of the component data for each task (using a new type, TRelativePtr) so that it doesn't have to go hunting for component headers when the task is run.
Implementation details:
IMovieSceneTaskScheduler exists as the public front-end to the new routines, but allows us to keep all the implementation details private.
Tasks come in one of two forms: a single task can have multiple prerequisites and subsequents; a parent task can have many children, all of which must be complete in order for the parent to be considered complete. The latter is used for depending on the results of forked and serial tasks that may be spawned multiple times.
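The parent/child completion rule can be sketched with a simple atomic counter (illustrative only - the real scheduler's bookkeeping is more involved):

```cpp
// Minimal sketch: a parent task is only complete once every child has
// completed, at which point the parent's subsequents may run.
#include <atomic>

struct ParentTask
{
    std::atomic<int> RemainingChildren{0};
    bool bComplete = false;

    void AddChild()
    {
        RemainingChildren.fetch_add(1, std::memory_order_relaxed);
    }

    // Called by each child as it finishes; the last child to finish
    // transitions the parent to complete.
    void OnChildComplete()
    {
        if (RemainingChildren.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            bComplete = true;
        }
    }
};
```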
When constructing the graph, we keep track of the read and write dependencies for each component type within each allocation. When a task reads from a component it must first depend on any tasks that write to the same component on the same allocation. When a task writes to a component it must depend on all upstream read and write dependencies for that component, on that allocation.
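The read/write rule above amounts to classic reader/writer dependency tracking per (component, allocation) pair. A minimal sketch, using hypothetical types rather than the engine's:

```cpp
// Sketch of per-component, per-allocation dependency tracking during graph
// construction: reads depend on the last writer; writes depend on the last
// writer and every read since it.
#include <cassert>
#include <vector>

struct ComponentDeps
{
    int LastWriter = -1;                // task index of most recent write, or -1
    std::vector<int> ReadersSinceWrite; // reads issued since that write
};

// Returns the prerequisites for a new task accessing this component on this
// allocation, and updates the tracked state.
std::vector<int> AddAccess(ComponentDeps& Deps, int TaskIndex, bool bWrite)
{
    std::vector<int> Prereqs;
    if (Deps.LastWriter != -1)
    {
        Prereqs.push_back(Deps.LastWriter); // both reads and writes wait on the writer
    }
    if (bWrite)
    {
        // Writes additionally wait for every read since the last write
        Prereqs.insert(Prereqs.end(), Deps.ReadersSinceWrite.begin(),
                                      Deps.ReadersSinceWrite.end());
        Deps.LastWriter = TaskIndex;
        Deps.ReadersSinceWrite.clear();
    }
    else
    {
        Deps.ReadersSinceWrite.push_back(TaskIndex);
    }
    return Prereqs;
}
```

Note that two reads of the same component never depend on each other, which is what allows forked read-only tasks over the same allocation to run concurrently.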
Moved TSparseBitSet to UE::MovieScene so I can use it for task dependencies
Introduced TRelativePtr, which allows storage of pointers relative to another pointer using a smaller type than a raw pointer would need, and isn't invalidated (as a raw pointer would be) if it points to an element in a dynamic array.
This is used for our prelocked component data, but may have uses elsewhere. Note that this is not a relative pointer that is relative to 'this' like some other relative-pointer implementations (though one could easily be written that uses this implementation if necessary).
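The idea can be sketched as follows (a simplified stand-in; TRelativePtr's real API may differ):

```cpp
// Sketch of a base-relative pointer: stores an offset from an externally
// supplied base using a smaller integer type, so it survives the backing
// array being reallocated. The offset is relative to the base pointer
// passed in at resolve time, not to 'this'.
#include <cassert>
#include <cstdint>

template <typename T, typename OffsetType = uint16_t>
struct TRelativePtrSketch
{
    OffsetType Offset = 0;

    void Set(const void* Base, const T* Ptr)
    {
        Offset = (OffsetType)((const uint8_t*)Ptr - (const uint8_t*)Base);
    }

    T* Resolve(void* Base) const
    {
        return (T*)((uint8_t*)Base + Offset);
    }
};
```

Because only the offset is stored, a prelocked task can resolve its cached component locations against whatever base the allocation currently lives at, without re-searching component headers each run.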
#rb Ludovic.Chabant, Max.Chen
#preflight 646e782e64351d76f3c508a2
[CL 25644373 by max chen in ue5-main branch]