Commit Graph

34 Commits

Author SHA1 Message Date
paul chipchase
6a1a0551da Enable UnsafeTypeCastWarningLevel errors for the virtualization module
#rb trivial
#jira none
#rnx
#preflight 6458e2596c35ad81e60f2fa9

- I thought I had already done this but I see nothing in the p4 history, so it is likely that I forgot to actually submit the change.
- Fix some additional warnings found under clang (C5219)

[CL 25370075 by paul chipchase in ue5-main branch]
2023-05-08 09:40:17 -04:00
paul chipchase
107e71c495 Add a way for VA backends to supply additional errors for failed pull operations when calling FVirtualizationManager::OnPayloadPullError
#rb none
#jira UE-181418
#rnx
#preflight 6447c5482804595a04d1e375

- Each pull operation now takes a FText that can be filled in with additional info about any failures that will cause the new error dialog to be displayed.
-- At the moment we only have info to provide from the source control backend if the connection itself failed.
-- In the future we might want to completely overhaul our error logging and display all messages to the user but our logging system is not really built for that level of redirection at the moment.
- Only pass on the error if it came from a persistent backend, we don't need to worry the user about cached backends which are allowed to fail.

[CL 25178752 by paul chipchase in ue5-main branch]
2023-04-25 08:47:06 -04:00
paul chipchase
2fd6372116 Add a new config option 'ForceCachingOnPull' to VA that when true will force backends to re-upload payloads when caching.
#rb none
#jira UE-182205
#preflight 643d2901c947f6523a2f5845

- Most backends support some form of existence check which is run before a push to prevent uploading data that is already there. Sometimes we want to ignore that check (bugs in the backend etc) and need a way to force that to happen.
- The new config option [Core.VirtualizationModule]:ForceCachingOnPull (default false) will do this when set to true.
- The file and DDC backends will respect the flag but for now the source control backend will continue to check to avoid accidental pointless revisions being added.

[CL 25066347 by paul chipchase in ue5-main branch]
2023-04-17 07:45:19 -04:00
paul chipchase
c65eb35e8c Add bitflag enums to IVirtualizationBackend::PullData and IVirtualizationBackend::PushData so that it is easier to specialize behavior in the future.
#rb Per.Larsson
#jira UE-182205
#preflight 643932b2211b661dc4f594c6

- The enums don't contain any entries at the moment, the goal of this change is to reduce the friction that future API changes might have.
- The older versions of PullData/PushData have been deprecated but this won't flag compiler warnings if people have derived their own implementations of IVirtualizationBackend, so in addition to deprecating the methods have been set to 'final' and have checkNoEntry implementations which is the best we can do at the moment.
-- It is unlikely that people have actually done this as the documentation on adding new backend types will only go out in 5.3.

[CL 25039959 by paul chipchase in ue5-main branch]
2023-04-14 10:13:46 -04:00
paul chipchase
31fd4ad8e2 CIS - Code formatting fix, remove newline that was added at the top of the cpp
#rb trivial
#jira none
#rnx
#preflight

[CL 25003922 by paul chipchase in ue5-main branch]
2023-04-12 04:51:15 -04:00
paul chipchase
b0a5ab7783 Change the VA filesystem backend to log the absolute file path of its directory so that it is easier for users to find.
#rb trivial
#jira none
#preflight 64365842d03b1c87dd7ca47d

- When the path being logged is relative, there is sometimes confusion as to which directory it is relative to.

[CL 25003785 by paul chipchase in ue5-main branch]
2023-04-12 04:13:35 -04:00
paul chipchase
b9bdf31e54 The VA filesystem backend no longer requires that the 'Path' option be setup in the config file.
#rb Per.Larsson
#rnx
#preflight 63bec803c543a64b7d38d89f

- If no 'Path' option is found the path will default to the projects saved directory, under a 'VAPayloads' subdirectory
- This works as an okay default if the backend is being used for cached storage.

[CL 23645795 by paul chipchase in ue5-main branch]
2023-01-11 10:56:47 -05:00
paul chipchase
909615801d Fix the VA filesystem backend from failing to serialize pushed payloads correctly.
#rb Per.Larsson
#jira UE-170012
#rnx
#preflight 63724fa79e3bea8079f2ed76

- The original for loop was added at a time where FCompressedBuffer did not support serialization directly, at that time the code worked.
- Now it looks like the FCompressedBuffer returned by the call to ::GetPayload is going out of scope early, which means that the FSharedBuffer ends up zero length (and quite possibly unsafe to use)
- However nowadays FCompressedBuffer has both a operator<< and direct ::Load/::Save methods so we can remove the for loop and simplify the code.

[CL 23133173 by paul chipchase in ue5-main branch]
2022-11-15 03:18:08 -05:00
paul chipchase
2300a007fe Allow the VA file system backend root path to be overriden by an environment variable, allowing users to choose where they want the data to be stored.
#rb Per.Larsson
#jira UE-168458
#rnx
#preflight 635a77d49b2e9c76c7052e41

[CL 22810926 by paul chipchase in ue5-main branch]
2022-10-27 11:33:48 -04:00
paul chipchase
462290f718 Extending the VA pulling API to allow for batch pulls
#rb Per.Larsson
#jira UE-163093
#rnx
#preflight 63526abcae33b04ec1ee4a65

- The caller can now request more than one payload to be pulled froim the system at a time. In theory this will allow backends to pull data quicker.
-- The file system backend works with the new system but makes no attempt to take advantage as the backend is not intended for serious production use.
-- The DDC backend should be taking full advantage of the new batching API
-- Note that although the source control API does attempt to take advantage of batching, the internal source control API implementation is not doing so. This will be addressed in a future submit.
- This does make the pulling logic a bit more complicated in FVirtualizationManager as we need to deal with some payloads being found in the first backend, some in the second and some in the third etc.
-- To help deal with this a new class FPullRequestCollection has been added which abstracts a lot of complexity away.

[CL 22707955 by paul chipchase in ue5-main branch]
2022-10-21 22:27:40 -04:00
paul chipchase
fe9f239ee2 Fix incorrect statistics being reported when virtualizing payloads. We no longer count a payload that was already stored in a backend as being uploaded there.
#rb Per.Larsson
#jira UE-160942
#rnx
#preflight 6336f2ab5c2225fe5f6c08a5

- Removed the status enum from FPushRequest and replaced it with a class that represents the push result instead.
-- This lets us return more context about problems. At the moment it is only being used to return the filter reason if a payload is filtered out. In the future we could return per payload error messages etc.
- Now that we return more detailed info about the payload push we no longer report payloads that were already in the backend as being pushed which could cause some very misleading stats to be shown.
- Changed the backends to use the new results system.
-- Not all paths were correctly setting the state in the file system backend.
-- The source control backend does not set the result in all paths, which will leave the results in the "pending" state. Which will still mark the payload as not uploaded and so is safe to do.
- Removed the pushing of a single method from IVirtualizationBackend as it was only used in one place internally, so we might as well just change that code to call the batch push overload.

[CL 22279127 by paul chipchase in ue5-main branch]
2022-09-30 15:47:04 -04:00
paul chipchase
8d77e84166 Refactor the VA pushing API so that we only need to implement a single path
#rb Per.Larsson
#jira UE-163103
#rnx
#preflight 6318989c2b7fe03eb664e9f0

### VirtualizationSystem/VirtualizationManager
- Added an overload of ::Push taking just one FPUshRequest so that people don't have to keep adding MakeArrayView boiler plate when pushing a single request
- Change the order of the last two parameters for the raw ::Push call as this will group all of the payload specific parameters together and leave the target storage type at the end. It is unlikely that anyone is calling the older version but it has been deprecated for safety.

### IVirtualizationBackend
- The none FPushRequest overload of ::PushData is no longer virtual, it just converts the parameters to FPushRequest and calls that overload instead. In this way we now only have one pushing code path in our backends. We could probably look into removing this overload at this point (since the higher level IVirtualizationSystem will now convert all push requests into a FPushRequest form) but it is not considered worth it at the moment when the simple overload covers our needs.
- Removed EPushResult in favour of just returning true/false for the overall operation.If the caller needs a more detailed breakdown then they will have to use an overload that takes an FPushRequest over raw parameters.
-- At the moment FPushRequest does not contain a full breakdown of what happened, so with this submit we are effectively losing the ability to find out if the payload was already in the backends or not, however the batch version of push was already not returning this info so it is not a big loss. Fixing FPushRequest to return a better break down of what happened will be done in UE-160942
- Removed the none batch Push paths from the source control and ddc backends as they already supported batch pushing.
- File backend needed to be converted to supporting batch pushing, which is pretty much the same code as before except we need to iterate over the container of FPushRequests.
-- The backend does not early out on error as it tends to be quite fast. We might want to consider an official policy for the VA system, if we should early out of errors or not.

[CL 21907558 by paul chipchase in ue5-main branch]
2022-09-08 19:35:36 -04:00
paul chipchase
dc4779a5e3 Add a number of ways for the VA backend connections to be changed to lazy initialize on first use.
#rb Devin.Doucette
#jira UE-161599
#rnx
#preflight 6303c8d65a5d4e4624e7bf52

- There are some use cases that require the VA system to be initialized and configured correctly but would prefer that the backend connections only run if absolutely needed (usually when a payload is pulled or pushed for the first time), this change provides four different ways of doing this:
-- Setting [Core.VirtualizationModule]LazyInitConnections=true in the Engine ini file
-- Setting the define 'UE_VIRTUALIZATION_CONNECTION_LAZY_INIT' to 1 in a programs .target.cs
-- Running with the commandline option -VA-LazyInitConnections
-- Setting the cvar 'VA.LazyInitConnections' to 1 (only works if it is set before the VA system is initialized, changing it mid editor via the console does nothing)
--- Note that after the config file, each setting there only opts into lazy initializing the connections, setting the cvar to 0 for example will not prevent the cmdline from opting in etc.

- In the future we will allow the connection code to run async, so the latency can be hidden behind the editor loading, but for the current use case we are taking the minimal approach.
-- This means we only support the backend being in 3 states. No connection has been made yet, the connection is broken and the connection is working.
-- To keep things simple we only record if we have attempted to connect the backends or not. We don't check individual backends nor do we try to reconnect failed ones etc. This is all scheduled for a future work item.
- If the connections are not initialized when the VA system is, we wait until the first time someone calls one of the virtualization methods that will actually use a connection: Push/Pull/Query
-- We try connecting all of the backends at once, even if they won't be used in the call to keep things simple.
- Only the source control backend makes use of the connection system. The horde storage (http) backend could take advantage too, but it is currently unused and most likely going to just be deleted so there seemed little point updating it.
- If we try to run an operation on an unconnected backend we only log to verbose. This is to maintain existing behaviour where a failed backend would not be mounted at all. This logging will likely be revisited in a future work item.

[CL 21511855 by paul chipchase in ue5-main branch]
2022-08-23 13:01:15 -04:00
UnrealBot
73409369c0 Branch snapshot for CL 21319338
[CL 21319338 in ue5-main branch]
2022-08-10 16:03:37 +00:00
devin doucette
b2a07ea03e DDC: Merge from UE5/Main
#preflight 6288ff678828ea88c8af7034
#preflight 628ab5d93246d5019db76ed2
#rb none
#rnx

#ROBOMERGE-OWNER: devin.doucette
#ROBOMERGE-AUTHOR: Devin.Doucette
#ROBOMERGE-SOURCE: CL 20353148 via CL 20353832 via CL 20353839
#ROBOMERGE-BOT: UE5 (Release-Engine-Staging -> Main) (v948-20297126)

[CL 20355348 by devin doucette in ue5-main branch]
2022-05-24 16:40:25 -04:00
paul chipchase
84d1e56ab8 Extend VA.MissBackends commandline to work via console commands as well.
#rb PJ.Kack
#jira UE-148223
#rnx
#preflight 6260f6e4360b45c32a84fae8

- The commandline -VA-MissBackends has been extended to a console command 'VA.MissBackends' that can be used at runtime to disable specific backends or reset the list and enable all of them again.
-- Entering VA.MissBackends without any args will print help documentation to the console/output log and show the user the names of the backends that they can disable.
- Technically we don't need the command line parsing anymore and users could use -dpcvars=VA.MissBackends=X instead but since we already have the code we might as well leave it untouched.
- Disabling backends from pulling payloads for debug purposes has been split off to it's own system stored as a

- IVirtualizationBackend::EOperations has been turned into bit flags, so we no longer need the 'both' entry
- IVirtualizationBackendnow has two EOperations members, one to record the operations that the backend supports and another that allows the operations to be disabled for debugging purposes.
- IVirtualizationBackend now has one method to poll if an operation is supported ::IsOperationSupported, which should not change over the lift time of the backend. In addition there is also a method for disabling/enabling a backend for debugging purposes as well as polling the debug state.

[CL 19845175 by paul chipchase in ue5-main branch]
2022-04-21 03:15:13 -04:00
paul chipchase
5844d17c52 Use the 'Log' verbosity when logging the file system backends retry count, there is no need to use 'Display' so that the end user sees it every time.
#rb trivial
#rnx
#preflight 625570ad69015afc27a5d597

[CL 19719467 by paul chipchase in ue5-main branch]
2022-04-12 09:07:12 -04:00
paul chipchase
52d01bd36f The virtualization system may now take the project name as a parameter when it is initialized rather than trying to find it itself.
#rb PJ.Kack
#rnx
#preflight 62276138671c913c0515c3b4

- To make it easier to extend the number of parameters when initializing the virtualization system, the function has been changed to accept a single struct, FInitParams, which will contains all potential parameters.
- Calling the version of UE::Virtualization::Initialize without any parameters will fallback to using the default values.
- The virtualization manager now passes the provided project name onto each backend that it creates.
- The source control backend now stored the provided project name and uses that when creating the submit description for new payloads rather than polling FApp for the current project name.
-- This is required for the stand alone virtualization application which will not have a specific project set.

[CL 19303638 by paul chipchase in ue5-main branch]
2022-03-08 11:22:45 -05:00
paul chipchase
cf3ac36861 Clean up the logging when initializing the virtualization module.
#rb trivial
#jira UE-134434
#rnx
#preflight 620d1a10742ffef420223b15

- Changing a number of log items from 'Log' to 'Display' so that the info shows up more easily.
- Removed a number of log items that added no useful infomation
- Edited some of the existing log items to be clearer.

[CL 19031890 by paul chipchase in ue5-main branch]
2022-02-17 02:53:27 -05:00
paul chipchase
05d81ee7f3 EditorBulkData and the virtualization system now use FIoHash directly and FPayloadId is removed
#rb Per.Larsson, Devin.Doucette
#jira UE-133497
#rnx
#preflight 61f835491c5ac5523462810a

- A lot of files have been touched but most of this is changing FPayload::IsValid to !FIoHash::IsZero, the main changes are in EditorBulkData/EditorBulkDataTests
- The only difference between FPayloadId and FIoHash is that the former considered the hash of an invalid or empty buffer to be 'invalid' too where as FIoHash would return a valid hash
-- To keep behaviour the same, we only hash payloads in EditorBulkData using a utility method ::HashBuffer which will not hash empty or invalid payloads and return a default FIoHash instead.
-- Unit tests have been extended to prove that the behaviour has not changed.

#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18806362 in //UE5/Release-5.0/... via CL 18808527 via CL 18821790
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v908-18788545)

[CL 18822154 by paul chipchase in ue5-main branch]
2022-02-02 02:21:24 -05:00
paul chipchase
0d4d469bbd Add LexToString to FPayloadId and change existing use of ToString to use it.
#rb Per.Larsson
#rnx
#jira UE-133497
#preflight 61f259a3706ac5ea0cf2ee3e

#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18751491 in //UE5/Release-5.0/... via CL 18751500 via CL 18751554
#ROBOMERGE-BOT: UE5 (Release-Engine-Test -> Main) (v903-18687472)

[CL 18751559 by paul chipchase in ue5-main branch]
2022-01-27 04:20:46 -05:00
paul chipchase
21530c814b Packages submitted from the editor together will now virtualize their payloads in a single batch rather than one at a time.
#rb PJ.Kack
#jira UE-136126
#rnx
#preflight

### VirtualizationSystem
- Added a new overload for Push to VirtualizationSystem that takes an array of FPushRequest, which is a new structure representing a single payload request.
- Filtering by package name is currently disabled, this is because the API has been forced into changing and passing the package name in via a FString rather than FPackagePath which means we would need to be more careful. This will be done in a future submit.
- The backend interface has been extended to also have a batch version of PushData, by default this will attempt to submit each request one at a time so payloads don't have to try and implement a batched version if there is no need.
- The context being passed with a payload when being pushed has been changed from FPackagePath to FString due to include order issues, as the FPackagePath lives in CoreUObject and the API for virtualization lives in Core. Additionally in the future the payloads might not be owned by a package (there is nothing specifically enforcing this) so the context being a string makes more sense.
- NOTE: Due to the context change we currently no longer support the filtering feature, which allows for payloads belonging to packages under specific directories to be excluded from virtualization. This is something that will be solved in a future submit.

### SourceControlBackend

- Now that we can submit multiple payloads in the same submit, the CL description has been changed slightly. We will now print a list of payload identifiers -> the package trying to submit that payload. This will only tell the users which package originally caused the payload to submit. If a user submits a new package at a later date that contains the same payload we will not be updating the description.

### PackageSubmissionChecks
- Converted the submission process to use the new batch push operation in VirtualizationSystem.
-- This means that we do a single push and then have to update the package trailers to convert the now pushed payloads from local to virtualized.
- Added new define UE_PRECHECK_PAYLOAD_STATUS that makes it easy to toggle off the checks to see which payloads need to be submitted to the persistent backend.  This is useful to test if it actually helps speed up the overall operations or if it is faster to just perform the batch push operations on all payloads and check the return values.
-- The hope is that over time the submission processes will become fast enough that we can remove the precheck.
- Fixed up logging to not always assume more than one package or payload.

### General Notes
- Errors and logging is now a bit more vague as we often not just report that X payloads failed etc rather than specific payload identifiers. This probably doesn't affect the user too much since those identifiers as fairly meaningless to them anyway.
- The source control submission could be further optimized by first checking the status of the files in thge depot and only then creating/switching workspace etc.
- As currently written, we need to load all of the payloads into memory, then the backends will do what they need (in the case of source control this results in the payloads being written to disk then submitted) which could create quite a large memory spike when submitting a large number of packages.
-- One solution would be to change the batch push API to take a "payload provider" interface and have the payloads requested as needed rather than passing in the FCompressedBuffer directly. This would let us immediately write the payload to disk for submission then discard it from memory, preventing larger spikes. Although it could cause overhead if there are multiple backends being submitted to. Internally we are unlikely to have more than one backend per storage solution so maybe we should just make it a config option?

#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18403735 in //UE5/Release-5.0/... via CL 18403737
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v896-18170469)

[CL 18403738 by paul chipchase in ue5-release-engine-test branch]
2021-12-08 02:19:42 -05:00
devin doucette
27c1393427 CompressedBuffer: Removed partial decompression from FCompressedBuffer now that FCompressedBufferReader is available
Requiring the use of a separate reader type makes it more likely that readers will be reused, and makes it easier to audit reader usage going forward. Reusing readers is desirable to reduce the number of large temporary allocations made during partial decompression of a buffer.

- Added FCompressedBuffer::Save(FArchive&) and renamed FromCompressed(FArchive&) to Load(FArchive&).
- Added FCompressedBufferReaderSourceScope to set a buffer source within a scope.
- Added proper bounds checks to FNoneDecoder.
- Store the header checksum on the decoder context to allow raw blocks to be reused across sources.
- Decode the header on the fly to avoid a temporary header allocation when the header is in contiguous memory.

#rb Zousar.Shaker
#rnx
#preflight 61a98d53800738dbfbc84c73

#ROBOMERGE-AUTHOR: devin.doucette
#ROBOMERGE-SOURCE: CL 18382211 in //UE5/Release-5.0/... via CL 18382310
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v896-18170469)

[CL 18382377 by devin doucette in ue5-release-engine-test branch]
2021-12-06 10:16:05 -05:00
paul chipchase
e38a952142 Submitting packages on projects with virtualization enabled is much faster when none of the payloads actually needs to be virtualized.
#rb PJ.Kack
#rnx
#preflight 61a795773c29b3cf13cd8250

### PackageSubmissionChecks
- Under the old model submitting a large number of packages could be very slow as each package would check each payload that it owns and is currently stored locally one at a time. For the source control backend this created quite a large overhead even when all of the payloads were already virtualized.
- Now we do a pass over the submitted files to find all valid packages that have package trailers with locally stored payloads, gather the payloads into a single list and then query that in one large batch.
- Once we find which payloads are not in permanent storage (note that in the case where a project is using multiple permanent storage solutions, if a payload is missing in one backend it counts as not being in permanent storage) we then attempt to virtualized them.
- Only after all of this is done will we create the truncated copy of each package and then append the updated trailer to each one. In theory doing it in this order this might slightly increase the change of submit failures that occur after virtualization that result in a package never being submitted and orphaned payloads being added to permanent storage, but this will always be a risk.
- Added an assert to fire if we detect a trailer with some virtualized and some local payloads. This should be a supported feature but needs proper testing first before we can allow it. With out current project settings no project should actually encounter this scenario.
- To make the code easier to follow we now early out of the entire check when errors are encountered.
- Added logging at various stages in the process to help show the user that something is happening and make problems easier to identify in the future.
- Notes
--  There is a lot of handling of invalid FPayloads. This is because it is currently possible to add empty payloads to the trailer which is inefficient and wastes space. The trailer will be modified to reject empty payloads in a future update at which point a lot of this handling can be removed.
-- This could've also been solved by not fully rehydrating a package on save by the end user, which will be added as a project setting in a future piece of work, but this approach will solve the edge case when the user does have a large amount of hydrated packages which contain payloads that are already virtualized so it was better to fix that now while we have good test cases for it.
-- We still have scaling problems with large number of package being submitted that do have payloads that need to be virtualized, this will be fixed by extending IVirtualizationSystem::Push to also accept batches of payloads in future work.
-- OnPrePackageSubmission could be broken up into smaller chunks to make the code easier to follow. This will be done after the batch payload submission work is done.

### VirtualizationSystem
- EStorageType has been promoted to enum class.
- Added a new enum FPayloadStatus to be used when querying if a payload exists in a backend storage system or not.
- Add a new method ::DoPayloadsExist which allows the caller to query if one or more payloads exists in the given backend storage system.

### VirtualizationManager
- Implemented ::DoPayloadsExist. First we get the results from each backend in the storage system (which return as true or false from each backend) then total how many backends found the payload in order to set the correct status.

### IVirtualizationBackend
- ::DoesPayloadExist which queries the existence of a single payload has been added to the interface. Most backends already implemented this for private use and if so have had their implementation renamed to match this.
- Also added ::DoPayloadsExist which takes a batch of FpayloadIdsto query. Some backends can deal with a batch of payload ids much more efficiently than one at a time, although the default implementation does call ::DoesPayloadExist for each requested payload.
-- The default implementation prevents every backend from needing to implement the same for loop but does allow backends that can gain from batching to override it.

### VirtualizationSourceControlBackend
- This backend does override ::DoPayloadsExist and implements it's own version as it tends to perform very poorly when not operating on larger batches.
- In this case ::DoesPayloadExist calls back to ::DoPayloadsExist to check each payload rather than implement as specific version.

### PackageTrailer
- The trailer can now be queries to request how many payloads of a given type it contains

#ROBOMERGE-AUTHOR: paul.chipchase
#ROBOMERGE-SOURCE: CL 18339847 in //UE5/Release-5.0/... via CL 18339852
#ROBOMERGE-BOT: STARSHIP (Release-Engine-Staging -> Release-Engine-Test) (v895-18170469)

[CL 18339859 by paul chipchase in ue5-release-engine-test branch]
2021-12-01 11:13:31 -05:00
Marc Audy
0c3be2b6ad Merge Release-Engine-Staging to Test @ CL# 18240298
[CL 18241953 by Marc Audy in ue5-release-engine-test branch]
2021-11-18 14:37:34 -05:00