- FastReferenceCollector will now flush struct references before suspending to avoid storing pointers to struct-owned UObject references between reachability iterations
- Persistent frame weak reference clearing will no longer attempt to clear the stored reference pointer when running incremental reachability; instead, it will re-run its owner's AddReferencedObjects with a special collector to clear any references to unreachable objects
- Objects marked by the GC barrier will now be processed immediately in the next reachability iteration instead of after reachability analysis has completed
#rb Kirill.Zorin
[FYI] Johan.Torp
[CL 26801914 by robert manuszewski in ue5-main branch]
Make AssetManager required; engine startup gives a fatal error if AssetManager is not present.
#rn Minor, Cooking
#rb Zousar.Shaker
#preflight 63ffd322df66ed5fc11d963e
[CL 24493164 by Matt Peters in ue5-main branch]
* Updated public headers for ~170 engine plugins using IWYU to remove unneeded includes. Removed includes are still available behind UE_ENABLE_INCLUDE_ORDER_DEPRECATED_IN_5_2
#preflight 63c08f4a2a6acaf1622bcc73
#rb none
[CL 23674775 by henrik karlsson in ue5-main branch]
* Updated private files with IWYU for all plugins which had 3 or fewer changes made in ue5 main since last integration to fn
#preflight 63bf8d8b577437afe607dc72
#rb none
[CL 23659643 by henrik karlsson in ue5-main branch]
List of optimizations and changes:
* Token stream structure
* Split token stream into strong-only and a mixed (weak+strong) stream
* Split token stream into a builder and a tighter view class which reduces sizeof(UClass)
* Implemented ref-counted token stream view sharing
* Removed Class and Outer from token stream
* Allow empty token streams (enabled by removing Class/Outer) to avoid touching token stream data
* Placed ARO (AddReferencedObjects) last to reduce per-object cache thrashing, improve control flow predictability and avoid reading the last EndOfPointer and EndOfStream tokens
* FPrefetchingObjectIterator that brings in Class/Outer, the class's token stream view and the first token data ahead of time
* Decode token bitfield once and ahead of time
* Reference queues and batch processing
* Introduced bounded queues: ref arrays -> unvalidated refs -> validated (non-null / non-permanent) refs
* Split all these queues for killable vs immutable references
* Stack-living references still handled synchronously. With removal of Class/Outer (prefetched ahead of time) few instances remain outside of ARO calls.
* Outer queues hold 32 items and get flushed when full.
* AddReferencedObjects (ARO) optimizations
* Misc optimizations in many ARO implementations
* New FReferenceCollector API to queue up ARO references (AddStableReference), old sync API (HandleObjectReference) still available
* New AddPropertyReferences traversal that replaces SerializeBin and PropertyValueIterator
* 4.5x faster than PropertyValueIterator
* Uses CLASSCAST dispatch instead of virtual SerializeItem dispatch.
* Step towards new unified token stream replacement shared by class token processing, structs and ARO
* Replaced StructUtil::AddReferencedObjects with AddPropertyReferences traversal, ~8x speedup and collects more references for CitySample
* Parallelism
* Single long-running task per worker
* Improved work-stealing / load-balancing, workers can steal full blocks, ARO calls and initial references
* Queue up slow ARO calls to improve load balancing and avoid late stragglers. Motivated by certain ARO calls taking over 2ms for a few specific objects.
* Kick tasks manually to avoid ParallelFor end synchronization
* FGCObject
* Initial reference collector runs in parallel with mark phase
* New FGCObject constructor API (AddStableNativeReferencesOnly) to opt-in to initial reference collection, used by StreamableManager
* Same constructor API allows FGCObjects to defer registration until they become active (RegisterLater), reduces number of active GCObjects
* Reduced memory usage
* Allocate reached objects in scratch pages (FWorkBlock) and reuse processed blocks, instead of swapping two big TArray<UObject*> per worker
* Reduced sizeof(UClass)
* Shareable token streams
* Misc optimizations
* New API to test if an object is in the permanent object pool. Old API read two global pointers for every visited reference.
* Fixed signed integer usage in GUObjectArray lookup that led to bad codegen
* FPropertyIterator optimizations
* SerializeBin optimizations
* Other changes
* Moved many helpers into UE::GC namespace
* Replaced TFastReferenceCollector API with simplified CollectReferences call. Needed to break this API anyway.
* Introduced FGCInternals to avoid forward-declaring TFastReferenceCollector and depending on the options enum in common headers
* Moved and outlined code from GarbageCollection.h / FastReferenceCollector.h to GarbageCollection.cpp
* Moved GC History and Garbage Reference Tracking into a synchronous TDebugReachabilityProcessor
* Removed PersistentGarbage flag since it wasn't used in practice
* Improved const correctness
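
The "outer queues hold 32 items and get flushed when full" item above can be pictured with a small sketch. This is not the engine's implementation: TBoundedBatchQueue, its Sink parameter and RunBoundedQueueExample are hypothetical names invented for illustration, and standard-library containers stand in for engine types. It only shows the general idea of a fixed-capacity buffer that hands off its contents as one batch when it fills up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity queue that forwards items in batches.
// Capacity defaults to 32 to mirror the number quoted in the changelist.
template <typename T, std::size_t Capacity = 32>
class TBoundedBatchQueue
{
public:
    // Sink is an assumption: here it just records each flushed batch.
    explicit TBoundedBatchQueue(std::vector<std::vector<T>>& InSink)
        : Sink(InSink)
    {
    }

    // Buffer one item; hand the whole batch off once the buffer is full.
    void Push(const T& Item)
    {
        Items[Num++] = Item;
        if (Num == Capacity)
        {
            Flush();
        }
    }

    // Forward whatever is buffered (possibly a partial batch) and reset.
    void Flush()
    {
        if (Num == 0)
        {
            return;
        }
        Sink.emplace_back(Items.begin(),
                          Items.begin() + static_cast<std::ptrdiff_t>(Num));
        Num = 0;
    }

private:
    std::array<T, Capacity> Items{};
    std::size_t Num = 0;
    std::vector<std::vector<T>>& Sink;
};

// Usage: 70 pushes produce two full batches; Flush emits the remaining 6.
inline void RunBoundedQueueExample()
{
    std::vector<std::vector<int>> Batches;
    TBoundedBatchQueue<int> Queue(Batches);
    for (int Index = 0; Index < 70; ++Index)
    {
        Queue.Push(Index);
    }
    assert(Batches.size() == 2);
    Queue.Flush();
    assert(Batches.size() == 3 && Batches[2].size() == 6);
}
```

Batching this way amortizes the per-item cost of handing work to the next stage, at the price of needing an explicit final Flush for partial batches.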
#rb robert.millar,robert.manuszewski,pj.kack
#preflight 63945bf45624e6da5ec85f88
#jira UE-169791
[CL 23475562 by johan torp in ue5-main branch]
Tested compiling fortnite, unrealeditor, lyra, qagame with non-unity/pch
#preflight 63635997876630122adeab9f
#rb none
[CL 22958990 by henrik karlsson in ue5-main branch]
Moved the GetTypeHash function to be a hidden friend instead of placing it directly in the global namespace.
Note that the function/operator needs to be fully inlined in the type or placed in the cpp. If the function is declared as a friend but then implemented outside the type, the hidden friend optimization won't work.
This should improve compile time somewhat according to MSVC devs.
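
A minimal sketch of the hidden friend pattern described above, in standard C++ without engine headers. FExampleKey, HashOf and RunHiddenFriendExample are hypothetical names used only for illustration, and the hash formula is arbitrary. The point is that the friend is defined inline inside the type, so it is found only through argument-dependent lookup and does not join the overload set considered for unrelated types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type, not taken from the engine.
struct FExampleKey
{
    std::uint32_t A;
    std::uint32_t B;

    // Hidden friend: declared AND defined inside the type. It is not
    // visible to ordinary unqualified lookup, only to ADL when an
    // FExampleKey argument is involved.
    friend std::uint32_t GetTypeHash(const FExampleKey& Key)
    {
        return Key.A * 31u + Key.B; // arbitrary illustrative hash
    }
};

// Generic caller: the unqualified call resolves through ADL at
// instantiation time, which is how a container would reach the hash.
template <typename T>
std::uint32_t HashOf(const T& Value)
{
    return GetTypeHash(Value);
}

inline void RunHiddenFriendExample()
{
    const FExampleKey Key{2u, 3u};
    assert(HashOf(Key) == 65u); // 2 * 31 + 3
}
```

If the friend were only declared in the type and then defined at namespace scope in a header, that definition would make the function visible to ordinary lookup again, which matches the caveat in the note above.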
#rb Steve.Robb
#preflight 6360b7052b5338aceb26471b
[CL 22889837 by henrik karlsson in ue5-main branch]
Before:
3548 unity files
Total CPU Time: 47343.578125 s
Total time in Parallel executor: 494.60 seconds
After:
3445 unity files
Total CPU Time: 46044.671875 s
Total time in Parallel executor: 468.51 seconds
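
For reference, the relative change implied by the figures above can be worked out with a small helper. This is only arithmetic on the numbers quoted in this changelist; PercentReduction, FBuildStats and the other names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: reduction from Before to After, in percent.
constexpr double PercentReduction(double Before, double After)
{
    return (Before - After) / Before * 100.0;
}

// Figures copied from the Before/After lists in this changelist.
struct FBuildStats
{
    double UnityFiles;
    double CpuSeconds;
    double ExecutorSeconds;
};

constexpr FBuildStats StatsBefore{3548.0, 47343.578125, 494.60};
constexpr FBuildStats StatsAfter{3445.0, 46044.671875, 468.51};

inline void RunBuildStatsExample()
{
    const double Files = PercentReduction(StatsBefore.UnityFiles, StatsAfter.UnityFiles);
    const double Cpu = PercentReduction(StatsBefore.CpuSeconds, StatsAfter.CpuSeconds);
    const double Executor = PercentReduction(StatsBefore.ExecutorSeconds, StatsAfter.ExecutorSeconds);

    // Roughly 2.9% fewer unity files, 2.7% less CPU time,
    // and 5.3% less time in the parallel executor.
    assert(std::fabs(Files - 2.90) < 0.01);
    assert(std::fabs(Cpu - 2.74) < 0.01);
    assert(std::fabs(Executor - 5.27) < 0.01);
}
```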
#jira
#preflight 63336159b20e73a098b7f24f
[CL 22218213 by bryan sefcik in ue5-main branch]