#rb Per.Larsson
#rnx
#jira UE-151377
#preflight 628364050039ea57a52d6989
### Virtualization
- [Core.ContentVirtualization] in the engine ini file now supports an array called 'DisabledAsset' which can be used to name asset types that should not virtualize their payloads.
-- By default (in BaseEngine.ini) we have disabled the StaticMesh asset as we know it will crash if a payload is missing and the SoundWave asset as it still is pending testing.
- This new way to disable virtualization is data driven. The older hard coded method has not been removed but will likely be reworked in a future submit.
- Now when an editor bulkdata is adding it's payload to the package trailer builder during package save it will poll the virtualization system with a call to the new method ::IsDisabledForObject by passing in it's owner.
-- If the owner is valid and was present in the 'DisabledAsset' array then the method will return true and the EPayloadFlags::DisableVirtualization flag will be applied.
### Package Trailer
- The pre-existing functionality of enum EPayloadFilter has been moved to a new enum EPayloadStorageType as will only filter payloads based on their storage type.
- EPayloadFilter has been modified to filter payloads based on functionality although at the moment the only thing it can filter for is to return payloads that can be virtualized, it is left for future expansion.
- EPayloadFlags has been reduced to a uint16 with the remaining 2bytes being turned into a new member EPayloadFilterReason.
- This new member allows us to record the exact reason why a payload is excluded from virtualization. If it is zero then the payload can virtualize, otherwise it will contain one or more reasons as to why it is being excluded. For this reason the enum is a bitfield.
- Added overloads of ::GetPayloads and ::GetNumPayloads that take EPayloadFilter rather than a EPayloadStorageType
- Added wrappers around all AccessMode types for FLookupTableEntry.
- FPackageTrailerBuilder has been extended to take a EPayloadFilterReason so that the caller can already provide a reason why the payload cannot be virtualized.
-- As a future peace of work this will probably be changed and we will ask the caller to pass in the owner UObject pointer instead and then we will process the filtering when building the package trailer to a) keep all of the filtering code in one place b) keep the filtering consistent
### PackageSubmissionChecks
- The virtualization process in will now request the payloads that can be virtualized from the package trailer, which respects the new payload flag, rather than requesting all locally stored payloads.
### UObjects
- There is no need for the SoundWave or MeshDescription classes to opt out of virtualization on construction. This will be done when the package is saved and is now data driven rather than being hardcoded.
### DumpPackagePayloadInfo
- The command has been updated to also display the filter reasons applied to each payload
[CL 20240971 by paul chipchase in ue5-main branch]
#rb Per.Larsson
#rnx
#jira UE-151000
#preflight 627271c83b2ed6f85363f53a
- In the old code when submitting packages for virtualization we would load all local payloads into memory then pass them to the virtualization system to be pushed to persistent storage. This works fine for an average users submits but when resaving large projects we can have submits with thousands of packages containing gigabytes of payloads.
- FPushRequest has been changed to accept either a payload directly (in the form of FCompressedBuffer) or accept a reference to a IPayloadProvider which will return the payload when asked.
- Most of the existing backends make no use of the batched pushing feature and only push one payload at a time. The default implementation of this will be to ask each request one at a time for the payload and then pass it to the backend for a single push. If the IPayloadProvider is being used we will therefor only ever have one payload loaded at a time avoiding memory spikes.
- The source control backend currently does implement the batched pushing feature. This backend writes each payload to disk before submitting them to perforce, so all we have to do is make sure that request one payload, write it to disk, then discard the payload to avoid memory spikes.
- This change required that FPushRequest make it's members private and provide accessors to them, as the caller shouldn't need to care if the payload is loaded or comes via the provider.
- NOTE that this implementation is less efficient if we have packages containing many different payloads as we will end up opening the package, parsing the trailer then load a payload many times over, where as the original code would load all payloads from the file in one go. In practice, given the overhead of the source control backend this didn't make a huge difference and is probably not worth the effort to optimize at this point.
[CL 20040609 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#rnx
#preflight 623ae40e7b69b01ec15da75c
#robomerge FNNC
- We need to expose the code that checks a set of packages for non virtualized payloads and virtualizes them so that the stand alone submission tool can access it.
- Currently I am adding this to the IVirtualizationSystem interface to avoid adding additional interface, but we might want to consider moving this some day as it is not a great fit with the rest of the interface.
[CL 19479757 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#rnx
#preflight 62276138671c913c0515c3b4
- To make it easier to extend the number of parameters when initializing the virtualization system, the function has been changed to accept a single struct, FInitParams, which will contains all potential parameters.
- Calling the version of UE::Virtualization::Initialize without any parameters will fallback to using the default values.
- The virtualization manager now passes the provided project name onto each backend that it creates.
- The source control backend now stored the provided project name and uses that when creating the submit description for new payloads rather than polling FApp for the current project name.
-- This is required for the stand alone virtualization application which will not have a specific project set.
[CL 19303638 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#rnx
#preflight 6225cf4cb57e715222ca15aa
- IVirtualizationSystem now has a new ::Initialize method which is called after it is first created. This allows the system to return a proper error when setting up which was not possible when the set up code ran in the constructor.
- The method also takes a FConfigFile reference which should be used when parsing settings from the config file system.
-- The config file being passed to this method, is either the optional config file provided when calling UE::Virtualization::Initialize, or (if that was not set) we find the GEngineIni config file from the global cache to use.
- This should ensure that the entire virtualization system is now loading the settings from the same set of config files where as before the code was a bit inconsistent (loading it in place, using the GConfig global, or trying to find the file from the global cache)
- If the system fails to create we now log if the problem was that we could not find a valid factory for the given system (meaning the module the system lives in was not yet loaded) or if the initialization phase failed.
- FVirtualizationManager now passes a single FConfigFile into all of it's methods that are parsing settings from it.
[CL 19283763 by paul chipchase in ue5-main branch]
#rb Per.Larsson
#rnx
#preflight 61fabeb8ad2ae6c3b763a2b9
- Renamed to ::QueryPayloadStatuses since the output data is in the form of an array of FPayloadStatus
- The method now returns EQueryResult rather than bool, which has 'Success' as value 0, all other values indicate an error.
-- This can be expanded in the future if we start to provide more info from backends as to what actually failed.
- FPayloadStatus has been cleaned up so that 'Partial' is not 'FoundPartial' to give a better indication that the payload was found in at least one storage backend, but not all of them.
[CL 18861036 by paul chipchase in ue5-main branch]
#rb Per.Larsson, Devin.Doucette
#jira UE-133497
#rnx
#preflight 61f835491c5ac5523462810a
- A lot of files have been touched but most of this is changing FPayload::IsValid to !FIoHash::IsZero, the main changes are in EditorBulkData/EditorBulkDataTests
- The only difference between FPayloadId and FIoHash is that the former considered the hash of an invalid or empty buffer to be 'invalid' too where as FIoHash would return a valid hash
-- To keep behaviour the same, we only hash payloads in EditorBulkData using a utility method ::HashBuffer which will not hash empty or invalid payloads and return a default FIoHash instead.
-- Unit tests have been extended to prove that the behaviour has not changed.
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18806362 in //UE5/Release-5.0/... via CL 18808527 via CL 18821790
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)
[CL 18822154 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#rnx
#jira UE-136758
#preflight 61f3e3626b5aea38e5b4cab7
- Package filtering was disabled when the api was changed to take batches of payloads to push to the backend as the context per payload had to be changed to a raw FString rather than FPackagePath.
- We now convert each context to a FPackagePath and if it is valid test against filtering, if it is not valid we assume that the payload did not come from a package.
- Still tempted to move filtering out of virtualization manager entirely and move it to PackageSubmissionChecks, but that can always be done at a later date.
- Retested the filtering system, plus the ability to add filters to the config files for GameFeature plugins.
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18776038 in //UE5/Release-5.0/... via CL 18777356 via CL 18777639
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)
[CL 18777650 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#rnx
#jira UE-140366
#preflight 61f2a6f752396dbfeeede332
- Add a way to poll the virtualization system to see if pushing to a backend storage type is possible or not.
- If we cannot push to persistent backend storage when submitting packages to source control we just log in verbose mode and return rather than giving an error. Submitting a non-virtualized package is no longer the problem it once was.
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18769119 in //UE5/Release-5.0/... via CL 18769125 via CL 18769151
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)
[CL 18769154 by paul chipchase in ue5-main branch]
#rb trivial
#rnx
#preflight
- When FPayloadId::ToString was switched to LexToString we changed from outputting an empty string for invalid ids to an id that is all zeros. However the stringbuilder path was not updated to do the same and so the tests failed.
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18752188 in //UE5/Release-5.0/... via CL 18752212 via CL 18752288
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)
[CL 18752291 by paul chipchase in ue5-main branch]
#rb PJ.Kack
#jira UE-136126
#rnx
#preflight
### VirtualizationSystem
- Added a new overload for Push to VirtualizationSystem that takes an array of FPushRequest, which is a new structure representing a single payload request.
- Filtering by package name is currently disabled, this is because the API has been forced into changing and passing the package name in via a FString rather than FPackagePath which means we would need to be more careful. This will be done in a future submit.
- The backend interface has been extended to also have a batch version of PushData, by default this will attempt to submit each request one at a time so payloads don't have to try and implement a batched version if there is no need.
- The context being passed with a payload when being pushed has been changed from FPackagePath to FString due to include order issues, as the FPackagePath lives in CoreUObject and the API for virtualization lives in Core. Additionally in the future the payloads might not be owned by a package (there is nothing specifically enforcing this) so the context being a string makes more sense.
- NOTE: Due to the context change we currently no longer support the filtering feature, which allows for payloads belonging to packages under specific directories to be excluded from virtualization. This is something that will be solved in a future submit.
### SourceControlBackend
- Now that we can submit multiple payloads in the same submit, the CL description has been changed slightly. We will now print a list of payload identifiers -> the package trying to submit that payload. This will only tell the users which package originally caused the payload to submit. If a user submits a new package at a later date that contains the same payload we will not be updating the description.
### PackageSubmissionChecks
- Converted the submission process to use the new batch push operation in VirtualizationSystem.
-- This means that we do a single push and then have to update the package trailers to convert the now pushed payloads from local to virtualized.
- Added new define UE_PRECHECK_PAYLOAD_STATUS that makes it easy to toggle off the checks to see which payloads need to be submitted to the persistent backend. This is useful to test if it actually helps speed up the overall operations or if it is faster to just perform the batch push operations on all payloads and check the return values.
-- The hope is that over time the submission processes will become fast enough that we can remove the precheck.
- Fixed up logging to not always assume more than one package or payload.
### General Notes
- Errors and logging is now a bit more vague as we often not just report that X payloads failed etc rather than specific payload identifiers. This probably doesn't affect the user too much since those identifiers as fairly meaningless to them anyway.
- The source control submission could be further optimized by first checking the status of the files in thge depot and only then creating/switching workspace etc.
- As currently written, we need to load all of the payloads into memory, then the backends will do what they need (in the case of source control this results in the payloads being written to disk then submitted) which could create quite a large memory spike when submitting a large number of packages.
-- One solution would be to change the batch push API to take a "payload provider" interface and have the payloads requested as needed rather than passing in the FCompressedBuffer directly. This would let us immediately write the payload to disk for submission then discard it from memory, preventing larger spikes. Although it could cause overhead if there are multiple backends being submitted to. Internally we are unlikely to have more than one backend per storage solution so maybe we should just make it a config option?
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18403735 in //UE5/Release-5.0/... via CL 18403737
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v896-18170469)
[CL 18403738 by paul chipchase in ue5-release-engine-test branch]
#rb PJ.Kack
#rnx
#preflight 61a795773c29b3cf13cd8250
### PackageSubmissionChecks
- Under the old model submitting a large number of packages could be very slow as each package would check each payload that it owns and is currently stored locally one at a time. For the source control backend this created quite a large overhead even when all of the payloads were already virtualized.
- Now we do a pass over the submitted files to find all valid packages that have package trailers with locally stored payloads, gather the payloads into a single list and then query that in one large batch.
- Once we find which payloads are not in permanent storage (note that in the case where a project is using multiple permanent storage solutions, if a payload is missing in one backend it counts as not being in permanent storage) we then attempt to virtualized them.
- Only after all of this is done will we create the truncated copy of each package and then append the updated trailer to each one. In theory doing it in this order this might slightly increase the change of submit failures that occur after virtualization that result in a package never being submitted and orphaned payloads being added to permanent storage, but this will always be a risk.
- Added an assert to fire if we detect a trailer with some virtualized and some local payloads. This should be a supported feature but needs proper testing first before we can allow it. With out current project settings no project should actually encounter this scenario.
- To make the code easier to follow we now early out of the entire check when errors are encountered.
- Added logging at various stages in the process to help show the user that something is happening and make problems easier to identify in the future.
- Notes
-- There is a lot of handling of invalid FPayloads. This is because it is currently possible to add empty payloads to the trailer which is inefficient and wastes space. The trailer will be modified to reject empty payloads in a future update at which point a lot of this handling can be removed.
-- This could've also been solved by not fully rehydrating a package on save by the end user, which will be added as a project setting in a future piece of work, but this approach will solve the edge case when the user does have a large amount of hydrated packages which contain payloads that are already virtualized so it was better to fix that now while we have good test cases for it.
-- We still have scaling problems with large number of package being submitted that do have payloads that need to be virtualized, this will be fixed by extending IVirtualizationSystem::Push to also accept batches of payloads in future work.
-- OnPrePackageSubmission could be broken up into smaller chunks to make the code easier to follow. This will be done after the batch payload submission work is done.
### VirtualizationSystem
- EStorageType has been promoted to enum class.
- Added a new enum FPayloadStatus to be used when querying if a payload exists in a backend storage system or not.
- Add a new method ::DoPayloadsExist which allows the caller to query if one or more payloads exists in the given backend storage system.
### VirtualizationManager
- Implemented ::DoPayloadsExist. First we get the results from each backend in the storage system (which return as true or false from each backend) then total how many backends found the payload in order to set the correct status.
### IVirtualizationBackend
- ::DoesPayloadExist which queries the existence of a single payload has been added to the interface. Most backends already implemented this for private use and if so have had their implementation renamed to match this.
- Also added ::DoPayloadsExist which takes a batch of FpayloadIdsto query. Some backends can deal with a batch of payload ids much more efficiently than one at a time, although the default implementation does call ::DoesPayloadExist for each requested payload.
-- The default implementation prevents every backend from needing to implement the same for loop but does allow backends that can gain from batching to override it.
### VirtualizationSourceControlBackend
- This backend does override ::DoPayloadsExist and implements it's own version as it tends to perform very poorly when not operating on larger batches.
- In this case ::DoesPayloadExist calls back to ::DoPayloadsExist to check each payload rather than implement as specific version.
### PackageTrailer
- The trailer can now be queries to request how many payloads of a given type it contains
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18339847 in //UE5/Release-5.0/... via CL 18339852
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)
[CL 18339859 by paul chipchase in ue5-release-engine-test branch]
This represents UE4/Main @18073326, Release-5.0 @18081140 and Dev-PerfTest @18045971
[CL 18081471 by aurel cordonnier in ue5-release-engine-test branch]
This represents UE4/Main @17911760, Release-5.0 @17915875 and Dev-PerfTest @17914035
[CL 17918595 by aurel cordonnier in ue5-release-engine-test branch]
#rb Danny Couture
#rnx
### Problem
- The crash would occur when the texture being since contains a reference to another texture (via composite) which is also in need of syncing on the users machine.
- After the files on disk have been updated the current packages are reloaded but the original version of the texture will still be in memory when the composite texture notifies that it has been reloaded causing a texture compilation job with the older now invalid texture.
-- VirtualizedBulkData relied on out of date packages not trying to access their payloads on disk once the file has been changed, but with the above scenario it was possible. There could also be other paths causing this (even if the access of out of date package payloads at this point is a waste of time)
### Fix
- The virtualized bulkdata objects now reference themselves with the FLinkerLoader if the payload is on disk, just as the old bulkdata system did. When file changing operations are invoked, the FLinkerLoader will inform it's referenced bulkdata to load the payloads off disk and hold them in memory.
-- Note that virtualized bulkdata was trying to avoid this memory bloat by design, but if we cannot be certain that the bulkdata will not try to access the payload after the file changes (as was thought) then we must take the payload into memory.
- Added a log message to VBD if we detect that the loaded payload does not match the VBD members expectations. This log message can either be an error or a fatal error depending on the define 'VBD_CORRUPTED_PAYLOAD_IS_FATAL' (defaulted to on)
- The registration to FLinkerLoader can be easily disabled by setting the define 'VBD_ALLOW_LINKERLOADER_ATTACHMENT' to 0
### Outstanding Issues
- Cloned bulkdata objects (via assignment) do not currently register themselves to the FLinkerLoader (the old bulkdata system did not) if we were to add this then we need to make sure that the FLinkerLoader registration is thread safe. This should be done as a separate work item so that we can unblock the current workflows that this changelist addresses.
-- Note that if we did not update cloned bulkdata to work with this system then we could drop the registration and just iterate over the package's bulkdata and force them to load into memory when required. However that would leave the cloned virtual bulkdata in an uncertain state and is not desirable.
- When saving a package, the virtualized bulkdata objects that it contains will change their members so that future payload access can come from the newly saved files, where as the old bulkdata system would force the payload to remain in memory for the rest of the editor session. If we want to keep this then we will need to try and make sure that the newly updated bulkdata can be registered so that it receives future notifications about the file being changed.
#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 17648118 via CL 17648147 via CL 17648161 via CL 17648163 via CL 17648175
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v875-17642767)
#ROBOMERGE[STARSHIP]: UE5-Main
[CL 17648184 by paul chipchase in ue5-release-engine-test branch]
Updating FVirtualizedUntypedBulkData and textures to use the BulkDataRegistry.
BulkDataRegistry: Add get/put accessors for the cached BulkDataList of packages.
EditorDomain: Move ClassDigests into a global variable that can be shared with BulkDataRegistry.
EditorDomain: Improve performance of GetFileSize by fetching metadata only.
Tickable Cook Objects, for systems used by the cooker that need to be ticked.
Implementation of the the BulkDataRegistry that uses the DDC cache for persistent storage of the BulkDataList.
#rb Devin.Doucette, Paul.Chipchase, Zousar.Shaker
#ROBOMERGE-SOURCE: CL 16768772 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v835-16672529)
[CL 16768778 by matt peters in ue5-release-engine-test branch]
- The mirage specific code is disabled behind the define UE_USE_VIRTUALBULKDATA, this means that some code paths in Texture/Mesh are much more complex than they need to be as we support both old and new paths. Once the system has been turned on and confirmed to cause no issues then this will be stripped out.
- SavePackageUtilities.cpp, SavePackage.cpp and SavePackage2.cpp are editgrates rather than integrations as those files have changes in DevCooker that we don't want to bring over immediately.
- Also includes a prototype system for storing bulkdata in a sidecar file in the workspace domain rather than in the .uasset/.umap file which although has been discontinued as part of mirage, will have applications for future work for non-virtualized projects and/or text based assets.
#rb Patrick.Finegan (all changes have been reviewed when submitted to Dev-Cooker)
#tests Cooking and running ShooterGame/Frosty and other sample programs using megascan assets
#rnx
#preflight 608be50d870cf400013ff99d
[CL 16167285 by paul chipchase in ue5-main branch]