UnrealEngineUWP/Engine/Source/Developer/Virtualization/Public/IVirtualizationBackend.h

// Copyright Epic Games, Inc. All Rights Reserved.
#pragma once
#include "Compression/CompressedBuffer.h"
#include "Containers/StringView.h"
#include "Features/IModularFeature.h"
#include "Features/IModularFeatures.h"
#include "Templates/UniquePtr.h"
#include "Virtualization/PayloadId.h"
#include "Virtualization/VirtualizationSystem.h"
namespace UE::Virtualization
{
/** Describes the result of an IVirtualizationBackend::PushData operation */
enum class EPushResult
{
/** The push failed, the backend should print an error message to 'LogVirtualization'.*/
Failed = 0,
/** The payload already exists in the backend and does not need to be pushed. */
PayloadAlreadyExisted,
/** The payload was successfully pushed to the backend. */
Success
};
/**
* The interface to derive from to create a new backend implementation.
*
* Note that virtualization backends are instantiated by FVirtualizationManager via
* IVirtualizationBackendFactory so each new backend derived from IVirtualizationBackend
* will also need a factory derived from IVirtualizationBackendFactory. You can either do
* this manually or use the helper macro 'UE_REGISTER_VIRTUALIZATION_BACKEND_FACTORY' to
* generate the code for you.
*
*/
class IVirtualizationBackend
{
protected:
/** Enum detailing which operations a backend can support */
enum class EOperations : uint8
{
/** Supports no operations, this should only occur when debug settings are applied */
None = 0,
/** Supports only push operations */
Push,
/** Supports only pull operations */
Pull,
/** Supports both push and pull operations */
Both
};
IVirtualizationBackend(FStringView InConfigName, FStringView InDebugName, EOperations InSupportedOperations)
: SupportedOperations(InSupportedOperations)
, ConfigName(InConfigName)
, DebugName(InDebugName)
{
checkf(InSupportedOperations != EOperations::None, TEXT("Cannot create a backend without supporting at least one type of operation!"));
}
public:
virtual ~IVirtualizationBackend() = default;
/**
* This will be called during the setup of the backend hierarchy. The config file
* entry that caused the backend to be created will be passed to the method so that any
* additional settings may be parsed from it.
* Take care to clearly log any errors that occur so that the end user has a clear way
* to fix them.
*
* @param ConfigEntry The entry for the backend from the config ini file that may
* contain additional settings.
* @return Returning false indicates that initialization failed in a way
* that the backend will not be able to function correctly.
*/
virtual bool Initialize(const FString& ConfigEntry) = 0;
/**
* The backend will attempt to store the given payload by whatever method the backend uses.
* NOTE: It is assumed that the virtualization manager will run all appropriate validation
* on the payload and its id and that the inputs to PushData can be trusted.
*
* @param Id The Id of the payload
* @param Payload A potentially compressed buffer representing the payload
* @return The result of the push operation
*/
virtual EPushResult PushData(const FPayloadId& Id, const FCompressedBuffer& Payload, const FString& PackageContext) = 0;
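/**
* Batch version of PushData. The default implementation pushes each request one at a
* time via the single payload overload and returns false as soon as one push fails, so
* backends only need to override this if they can handle a whole batch more efficiently.
*
* @param Requests The payloads to push, each with its identifier and context
* @return True if every request was pushed or already existed, otherwise false
*/
virtual bool PushData(TArrayView<FPushRequest> Requests)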
{
// TODO: Sort return codes
for (FPushRequest& Request : Requests)
{
EPushResult Result = PushData(Request.Identifier, Request.Payload, Request.Context);
switch (Result)
{
case EPushResult::Failed:
return false;
case EPushResult::PayloadAlreadyExisted:
case EPushResult::Success:
break;
default:
checkNoEntry();
break;
}
}
return true;
}
/**
* The backend will attempt to retrieve the given payload by whatever method the backend uses.
* NOTE: It is assumed that the virtualization manager will validate the returned payload to
* make sure that it matches the requested id so there is no need for each backend to do this.
*
* @param Id The Id of a payload to try and pull from the backend.
*
* @return A valid FCompressedBuffer containing the payload if the pull
* operation succeeded and a null FCompressedBuffer
* if it did not.
*/
virtual FCompressedBuffer PullData(const FPayloadId& Id) = 0;
/**
* Checks if a payload exists in the backend's storage.
*
* @param Id The identifier of the payload to check
*
* @return True if the backend storage already contains the payload, otherwise false
*/
virtual bool DoesPayloadExist(const FPayloadId& Id) = 0;
/**
* Checks if a number of payloads exist in the backend's storage.
*
* @param[in] PayloadIds An array of FPayloadId that should be checked
* @param[out] OutResults An array to contain the result, true if the payload
* exists in the backend's storage, false if not.
* This array will be resized to match the size of PayloadIds.
*
* @return True if the operation completed without error, otherwise false
*/
virtual bool DoPayloadsExist(TArrayView<const FPayloadId> PayloadIds, TArray<bool>& OutResults)
{
// This is the default implementation that just calls ::DoesPayloadExist on each FPayloadId
// in the array, one at a time.
// Backends may override this with their own implementations if it can be done with less
// overhead by performing the check on the entire batch instead.
OutResults.SetNum(PayloadIds.Num());
for (int32 Index = 0; Index < PayloadIds.Num(); ++Index)
{
OutResults[Index] = DoesPayloadExist(PayloadIds[Index]);
}
return true;
}
/** Used when debugging to disable the pull operation */
void DisablePullOperationSupport()
{
if (SupportedOperations == EOperations::Pull)
{
SupportedOperations = EOperations::None;
}
else if (SupportedOperations == EOperations::Both)
{
SupportedOperations = EOperations::Push;
}
}
/** Return true if the backend supports push operations. Returning true allows ::PushData to be called. */
bool SupportsPushOperations() const
{
return SupportedOperations == EOperations::Push || SupportedOperations == EOperations::Both;
}
/** Return true if the backend supports pull operations. Returning true allows ::PullData to be called. */
bool SupportsPullOperations() const
{
return SupportedOperations == EOperations::Pull || SupportedOperations == EOperations::Both;
}
/** Returns a string containing the name of the backend as it appears in the virtualization graph in the config file */
const FString& GetConfigName() const
{
return ConfigName;
}
/** Returns a string that can be used to identify the backend for debugging and logging purposes */
const FString& GetDebugName() const
{
return DebugName;
}
private:
/** The operations that this backend supports */
EOperations SupportedOperations;
/** The name assigned to the backend by the virtualization graph */
FString ConfigName;
/** Combination of the backend type and the name used to create it in the virtualization graph */
FString DebugName;
};
/**
* Derive from this interface to implement a factory to return a backend type.
* An instance of the factory should be created and then registered to
* IModularFeatures with the feature name "VirtualizationBackendFactory" to
* give 'FVirtualizationManager' access to it.
* The macro 'UE_REGISTER_VIRTUALIZATION_BACKEND_FACTORY' can be used to create
* a factory easily if you do not want to specialize the behaviour.
*/
class IVirtualizationBackendFactory : public IModularFeature
{
public:
/**
* Creates a new backend instance.
*
* @param ConfigName The name given to the back end in the config ini file
* @return A new backend instance
*/
virtual TUniquePtr<IVirtualizationBackend> CreateInstance(FStringView ConfigName) = 0;
/** Returns the name used to identify the type in config ini files */
virtual FName GetName() = 0;
};
/**
* This macro is used to generate a backend factory's boilerplate code if you do not
* need anything more than the default behavior.
* As well as creating the class, a single instance will be created which will register the factory with
* 'IModularFeatures' so that it is ready for use.
*
* @param BackendClass The name of the class derived from 'IVirtualizationBackend' that the factory should create
* @param ConfigName The name used in config ini files to reference this backend type.
*/
#define UE_REGISTER_VIRTUALIZATION_BACKEND_FACTORY(BackendClass, ConfigName) \
class BackendClass##Factory : public IVirtualizationBackendFactory \
{ \
public: \
BackendClass##Factory() { IModularFeatures::Get().RegisterModularFeature(FName("VirtualizationBackendFactory"), this); }\
virtual ~BackendClass##Factory() { IModularFeatures::Get().UnregisterModularFeature(FName("VirtualizationBackendFactory"), this); } \
private: \
virtual TUniquePtr<IVirtualizationBackend> CreateInstance(FStringView ConfigName) override { return MakeUnique<BackendClass>(ConfigName, WriteToString<256>(#ConfigName, TEXT(" - "), ConfigName).ToString()); } \
virtual FName GetName() override { return FName(#ConfigName); } \
}; \
static BackendClass##Factory BackendClass##Factory##Instance;
} // namespace UE::Virtualization
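
// Example (illustrative sketch only, not engine code): the interface above pairs each
// pure virtual single-payload method with a batch method that has a loop-based default.
// The standalone C++17 sketch below mirrors that shape with stand-in types so it can be
// compiled outside the engine. PayloadId, Buffer, PushRequest, BackendSketch and
// InMemoryBackend are all hypothetical names invented for this example; they are
// assumptions and do not match the real FPayloadId, FCompressedBuffer or FPushRequest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-ins for the engine types (assumptions, not the real definitions).
using PayloadId = std::string;
using Buffer = std::vector<unsigned char>;

enum class EPushResult { Failed = 0, PayloadAlreadyExisted, Success };

struct PushRequest
{
    PayloadId Identifier;
    Buffer Payload;
    std::string Context;
};

// Simplified mirror of the interface shape: single-payload methods are pure virtual,
// batch methods have defaults that call the single-payload version in a loop.
class BackendSketch
{
public:
    virtual ~BackendSketch() = default;

    virtual EPushResult PushData(const PayloadId& Id, const Buffer& Payload, const std::string& Context) = 0;
    virtual bool DoesPayloadExist(const PayloadId& Id) = 0;

    // Default batch push: stops and reports failure on the first failed request.
    virtual bool PushData(const std::vector<PushRequest>& Requests)
    {
        for (const PushRequest& Request : Requests)
        {
            if (PushData(Request.Identifier, Request.Payload, Request.Context) == EPushResult::Failed)
            {
                return false;
            }
        }
        return true;
    }

    // Default batch query: resizes the output and checks each id one at a time.
    virtual bool DoPayloadsExist(const std::vector<PayloadId>& Ids, std::vector<bool>& OutResults)
    {
        OutResults.resize(Ids.size());
        for (std::size_t Index = 0; Index < Ids.size(); ++Index)
        {
            OutResults[Index] = DoesPayloadExist(Ids[Index]);
        }
        return true;
    }
};

// Hypothetical backend that stores payloads in a map and relies on both batch defaults.
class InMemoryBackend final : public BackendSketch
{
public:
    // Overriding one PushData overload hides the other; re-expose the batch version.
    using BackendSketch::PushData;

    EPushResult PushData(const PayloadId& Id, const Buffer& Payload, const std::string& /*Context*/) override
    {
        if (Id.empty())
        {
            return EPushResult::Failed; // treat an empty id as invalid in this sketch
        }
        const bool bInserted = Storage.emplace(Id, Payload).second;
        return bInserted ? EPushResult::Success : EPushResult::PayloadAlreadyExisted;
    }

    bool DoesPayloadExist(const PayloadId& Id) override
    {
        return Storage.count(Id) != 0;
    }

private:
    std::unordered_map<PayloadId, Buffer> Storage;
};
```

// A backend written this way only implements the single-payload methods. One that can
// answer a whole batch cheaply (the kind of case the comment in DoPayloadsExist
// describes) would override the batch methods instead of relying on the loops.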