UnrealEngineUWP/Engine/Source/Developer/Virtualization/Private/VirtualizationDDCBackend.h

// Copyright Epic Games, Inc. All Rights Reserved.
#pragma once
#include "IVirtualizationBackend.h"
#include "DerivedDataCacheKey.h"
namespace UE::DerivedData { enum class ECachePolicy : uint32; }
namespace UE::Virtualization
{
/**
* A backend that uses the DDC2 as its storage mechanism. It is intended to be used as a local caching
* system to speed up operations rather than for use as persistent storage.
*
* Ini file setup:
* 'Name'=(Type=DDCBackend, Bucket="XXX", LocalStorage=True/False, RemoteStorage=True/False)
*
* Required Values:
* 'Name': The backend name in the hierarchy.
*
* Optional Values:
* Bucket: An alphanumeric identifier used to group the payloads together in storage. [Default="BulkData"]
* LocalStorage: When set to true, the payloads can be stored locally. [Default=true]
* RemoteStorage: When set to true, the payloads can be stored remotely. [Default=true]
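*
* Example entry (a sketch only; the name 'DDCCache' is hypothetical and the values are illustrative),
* configuring a backend that may store payloads locally but not remotely:
* DDCCache=(Type=DDCBackend, Bucket="BulkData", LocalStorage=True, RemoteStorage=False)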
*/
class FDDCBackend final : public IVirtualizationBackend
{
public:
	explicit FDDCBackend(FStringView ProjectName, FStringView ConfigName, FStringView InDebugName);
	virtual ~FDDCBackend() = default;

private:
	/* IVirtualizationBackend implementation */

	virtual bool Initialize(const FString& ConfigEntry) override;
	virtual EConnectionStatus OnConnect() override;

	virtual bool PushData(TArrayView<FPushRequest> Requests) override;
	virtual bool PullData(TArrayView<FPullRequest> Requests) override;
	virtual bool DoesPayloadExist(const FIoHash& Id) override;

private:
	/** The bucket being used to group together the virtualized payloads in storage */
	FString BucketName;

	/** The FCacheBucket used with the DDC, cached to avoid recreating it for each request */
	UE::DerivedData::FCacheBucket Bucket;

	/** The policy to use when uploading or downloading data from the cache */
	UE::DerivedData::ECachePolicy TransferPolicy;

	/** The policy to use when querying the cache for information */
	UE::DerivedData::ECachePolicy QueryPolicy;
};
} // namespace UE::Virtualization