888 Commits

Author SHA1 Message Date
devin doucette
1821fa9481 DDC: Fix race conditions on counters in the Jupiter and S3 cache stores
#rb Steve.Robb
#rnx

[CL 31262949 by devin doucette in ue5-main branch]
2024-02-07 12:51:56 -05:00
danny couture
8270d4e339 [ZenCacheStore]
- Fix race condition on counters

#rb dan.engelbrecht

[CL 31255886 by danny couture in ue5-main branch]
2024-02-07 09:25:38 -05:00
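Neither of the two counter fixes above shows its code in this listing. As a hedged sketch of the general technique only (not the actual change), a statistics counter shared between request threads can be made race-free with `std::atomic`. `FCacheStats`, `RecordHit`, and `RunCounterStress` are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stats block; not the engine's actual type.
struct FCacheStats
{
    std::atomic<int64_t> HitCount{0};
    std::atomic<int64_t> BytesRead{0};

    void RecordHit(int64_t Bytes)
    {
        // fetch_add is a single atomic read-modify-write, so concurrent
        // callers cannot lose updates the way `++Count` on a plain integer can.
        HitCount.fetch_add(1, std::memory_order_relaxed);
        BytesRead.fetch_add(Bytes, std::memory_order_relaxed);
    }
};

// Updates the counters from several threads and returns the final hit count.
inline int64_t RunCounterStress(int ThreadCount, int HitsPerThread)
{
    FCacheStats Stats;
    std::vector<std::thread> Threads;
    for (int i = 0; i < ThreadCount; ++i)
    {
        Threads.emplace_back([&Stats, HitsPerThread]
        {
            for (int j = 0; j < HitsPerThread; ++j)
            {
                Stats.RecordHit(16);
            }
        });
    }
    for (std::thread& Thread : Threads)
    {
        Thread.join();
    }
    return Stats.HitCount.load();
}
```

Relaxed ordering is enough here because the counters are only reported, never used to publish other data between threads.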
zousar shaker
081894b3f3 Fix log format convention for HttpCacheStore to match ZenCacheStore - log lines should be prefixed by node name, not domain or URL.
[CL 31158214 by zousar shaker in ue5-main branch]
2024-02-03 02:36:18 -05:00
mark lintott
0db277eb21 Changed Derived Data Cache Usage UI to use the same Resource Stats as Studio Telemetry
The total is already served up as an Asset Type so had to add some logic to separate it in the view.
Changed sorting from Size to Count. I felt that sorting by highest count is more useful as it clearly indicates the less efficient DDC work.
Added Hit Efficiency to Asset Stats and added this as an extra column to the UI
#rb Devin.Doucette

[CL 31009963 by mark lintott in ue5-main branch]
2024-01-30 12:02:31 -05:00
zousar shaker
662d9b1d71 Adjusting logs related to zen usage to:
- Remove repeated log related to fetching cache stats
- Ensure that the Zen cache usage has log lines explaining the status of the caches it attempts to connect to (success or failure)

#rb Matt.Peters

[CL 30710842 by zousar shaker in ue5-main branch]
2024-01-18 22:43:41 -05:00
devin doucette
e675fcb5b5 DDC: Exposed metadata more extensively in the build API
#rb Zousar.Shaker

[CL 30691096 by devin doucette in ue5-main branch]
2024-01-18 14:06:50 -05:00
aris theophanidis
8ae2292282 Remove Compression.h from CoreMinimal.h
It's about 1/4 of CoreMinimal.h but rarely needed (Compression.h pulls on CriticalSection.h and Map.h that are costly).
#rb Yoan.StAmant

[CL 30683417 by aris theophanidis in ue5-main branch]
2024-01-18 09:56:55 -05:00
steve robb
66266c6a11 Fixed up DerivedDataCache, DesktopPlatform, ApplicationCore, AssetRegistry, Core, CoreUObject, Projects, Sockets code to use EAllowShrinking instead of bools.
[CL 30676428 by steve robb in ue5-main branch]
2024-01-17 19:51:06 -05:00
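The motivation for replacing a bool with an enum at a call site can be shown with a small standalone sketch. The enum below is a local stand-in declared so the example compiles on its own, and `PopLast` is a hypothetical helper, not an engine function.

```cpp
#include <cstddef>
#include <vector>

// Local stand-in for the engine enum, declared here so the sketch is self-contained.
enum class EAllowShrinking : unsigned char { No, Yes };

// Hypothetical helper: removes the last element, optionally releases unused
// capacity, and returns the new element count.
inline std::size_t PopLast(std::vector<int>& Values, EAllowShrinking AllowShrinking)
{
    if (!Values.empty())
    {
        Values.pop_back();
    }
    if (AllowShrinking == EAllowShrinking::Yes)
    {
        Values.shrink_to_fit();
    }
    return Values.size();
}
```

A call such as `PopLast(Values, EAllowShrinking::No)` states its intent at the call site, whereas `PopLast(Values, false)` would require looking up the parameter name.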
marc audy
19e84555b3 Silence PVS warnings
[CL 30653812 by marc audy in ue5-main branch]
2024-01-17 01:34:02 -05:00
zousar shaker
e7e33882d3 Change the internal use of cache flags in the DDC hierarchy to have the flag usage be atomic and avoid needing to take a write lock on the node's lock when the flags are being set. This is to avoid an issue where recursive read locks can lead to deadlocks in the presence of a write lock. A future change is planned to alter the locks to ones that support recursive use, at which point this change will no longer be necessary.
#rb Devin.Doucette

[CL 30597063 by zousar shaker in ue5-main branch]
2024-01-12 12:47:57 -05:00
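The idea in e7e33882d3, setting flags atomically rather than under a write lock, can be sketched as follows. This is an illustration under assumptions: `ENodeFlags` and `FNodeFlagState` are hypothetical names, and the real flag set and node type are not shown in the commit message.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical flag set; the real DDC flags are not shown in the commit.
enum class ENodeFlags : uint32_t
{
    None     = 0,
    Readable = 1u << 0,
    Writable = 1u << 1,
};

class FNodeFlagState
{
public:
    // Sets bits with one atomic OR; no write lock on the owning node is needed.
    void Add(ENodeFlags Flags)
    {
        Bits.fetch_or(static_cast<uint32_t>(Flags), std::memory_order_release);
    }

    // Clears bits with one atomic AND.
    void Remove(ENodeFlags Flags)
    {
        Bits.fetch_and(~static_cast<uint32_t>(Flags), std::memory_order_release);
    }

    // True when every bit in Flags is currently set.
    bool Has(ENodeFlags Flags) const
    {
        const uint32_t Mask = static_cast<uint32_t>(Flags);
        return (Bits.load(std::memory_order_acquire) & Mask) == Mask;
    }

private:
    std::atomic<uint32_t> Bits{0};
};
```

Because readers never wait on a writer to change the flags, a thread that already holds a read lock elsewhere cannot be blocked behind a pending flag update.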
dan engelbrecht
ba66b561de For local Zen DDC connections show the path where data is stored in the Cache Statistics tab for DDC instead of the local IP address
For Zen DDC connection show the storage size
#jira UE-199929
#rb Devin.Doucette

[CL 30501101 by dan engelbrecht in ue5-main branch]
2024-01-09 06:08:29 -05:00
zousar shaker
557cdf9e71 Initialize an out variable to satisfy static analysis.
[CL 30297745 by zousar shaker in ue5-main branch]
2023-12-13 13:17:25 -05:00
zousar shaker
ae71242205 Change the way readiness of ZenCacheStore is evaluated and re-evaluated.
Instead of relying on a one-time blocking check with no response timeout, we now issue a blocking request with a 5 second idle time limit.  If it fails, the store will still be created, but it will go into the same asynchronous re-evaluation loop as when performance is below the acceptable threshold and re-evaluate at 30 second intervals until both:

- Health is Ok
- Performance criteria (if any has been configured and is in use) is met

At which point it is activated.  Health checks have been changed from the health/status endpoint to health/ready because we don't want to act if the server is running but not ready for requests (e.g. during the time when it may be wiping data during a schema change).

The overall goal is that we don't force the entire session to execute without zenserver if zenserver was not ready at startup.

#rb Devin.Doucette

[CL 30295095 by zousar shaker in ue5-main branch]
2023-12-13 11:34:21 -05:00
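The readiness flow described in ae71242205 can be sketched as a small state holder. This is a simplified illustration, not the ZenCacheStore implementation: `FStoreReadiness`, `FProbeResult`, and the probe callback are hypothetical, and the timer that would call `Reevaluate` every 30 seconds is left to the caller.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical result of one readiness probe.
struct FProbeResult
{
    bool bHealthOk = false;       // the readiness endpoint reported ready
    bool bPerformanceOk = false;  // no criteria configured, or criteria met
};

class FStoreReadiness
{
public:
    // The probe receives the idle time limit it should apply to its request.
    using FProbe = std::function<FProbeResult(std::chrono::seconds)>;

    explicit FStoreReadiness(FProbe InProbe) : Probe(std::move(InProbe)) {}

    // Startup: one blocking probe with a 5 second idle limit. The store exists
    // either way; it simply starts inactive if the probe fails.
    bool EvaluateAtStartup() { return Evaluate(); }

    // Intended to be called every 30 seconds while the store is inactive.
    bool Reevaluate() { return bActive || Evaluate(); }

    bool IsActive() const { return bActive; }

private:
    bool Evaluate()
    {
        const FProbeResult Result = Probe(std::chrono::seconds(5));
        // Activation requires both conditions listed in the commit message.
        bActive = Result.bHealthOk && Result.bPerformanceOk;
        return bActive;
    }

    FProbe Probe;
    bool bActive = false;
};
```

The sketch only covers the path from inactive to active; deactivating a store whose performance later drops would need an additional transition.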
devin doucette
c7cee256ca DDC: Added a fatal error if the cache is not readable or writable
#rb Zousar.Shaker

[CL 30294521 by devin doucette in ue5-main branch]
2023-12-13 11:07:59 -05:00
zousar shaker
3909b8757e Avoid logging Display verbosity messages when a DDC request to HttpCacheStore is canceled. Also rename ExpectedErrorCodes to ExpectedStatusCodes to avoid conflating the distinct meanings of ErrorCodes and StatusCodes.
[CL 30275083 by zousar shaker in ue5-main branch]
2023-12-12 15:49:30 -05:00
zousar shaker
49d387e6c5 Use the low speed limit with the zen cache store client, but with a very tolerant threshold (only require 30 bytes over 60 seconds).
#rb dan.engelbrecht, Devin.Doucette

[CL 30265167 by zousar shaker in ue5-main branch]
2023-12-12 11:23:43 -05:00
devin doucette
7f3b930d93 DDC: Temporarily restored blocking during get requests in HttpCacheStore
#rb Zousar.Shaker
#rnx

[CL 30264462 by devin doucette in ue5-main branch]
2023-12-12 11:01:57 -05:00
dan engelbrecht
b30f7932e4 If we find an active parsed node in the ddc graph when creating it, return it instead of returning nullptr.
The main cooker process sets up the environment variable UE-ZenSharedDataCacheHost as part of its initialization so the ZenShared instance is created by the worker processes when they initialize before the redirection for the file share is detected.

In the main cooker process that env-variable is not set so it skips creating the ZenShared instance until it finds the redirection.

Renamed ICacheStoreGraph::Create -> ICacheStoreGraph::FindOrCreate to better reflect the functionality.

#rb Devin.Doucette, Zousar.Shaker

[CL 30248058 by dan engelbrecht in ue5-main branch]
2023-12-11 15:52:04 -05:00
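The find-or-create behavior that motivated the rename in b30f7932e4 follows a common pattern, sketched below. `FCacheGraph` and `FCacheNode` are hypothetical stand-ins; the real `ICacheStoreGraph::FindOrCreate` signature is not shown in the commit message.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical node type for illustration.
struct FCacheNode
{
    std::string Name;
};

class FCacheGraph
{
public:
    // Returns the existing node for Name if one was already created,
    // otherwise creates it. Never returns nullptr for a repeated request.
    std::shared_ptr<FCacheNode> FindOrCreate(const std::string& Name)
    {
        std::shared_ptr<FCacheNode>& Slot = Nodes[Name];
        if (!Slot)
        {
            Slot = std::make_shared<FCacheNode>(FCacheNode{Name});
        }
        return Slot;
    }

private:
    std::map<std::string, std::shared_ptr<FCacheNode>> Nodes;
};
```

A second request for the same name returns the same instance, which is the case the commit describes for a node that was created earlier by another code path.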
devin doucette
59dd82d717 DDC: Eliminated most blocking during get requests in HttpCacheStore
This restores the changes from 29016116 that were temporarily reverted.

#rb Zousar.Shaker
#rnx

[CL 30210802 by devin doucette in ue5-main branch]
2023-12-08 10:24:22 -05:00
zousar shaker
93a052ff58 Fix a typo in a log message.
[CL 30100313 by zousar shaker in ue5-main branch]
2023-12-04 18:28:00 -05:00
zousar shaker
a5d833e390 Add EQueuedWorkFlags::DoNotRunInsideBusyWait to tasks started by DDC so that, as long-running tasks, they don't get picked up by arbitrary busy waits.
#rb Devin.Doucette

[CL 30043718 by zousar shaker in ue5-main branch]
2023-12-01 11:56:56 -05:00
zousar shaker
e1d9e1c096 Don't attempt immediate retry when a 502 error happens during communication with the UE Cloud DDC server.
[FYI] joakim.lindqvist

[CL 30024951 by zousar shaker in ue5-main branch]
2023-11-30 17:03:55 -05:00
zousar shaker
81874f5865 Change the way ZenCacheStore configuration and presentation is implemented to align with the pattern used in HttpCacheStore so that we can use a ServerID parameter to refer to a configuration in the StorageServers section of the config while still retaining the ability to override values as needed. No config changes at this point, those will happen later.
#rb Devin.Doucette

[CL 29926397 by zousar shaker in ue5-main branch]
2023-11-25 16:22:26 -05:00
zousar shaker
9fbac7bd41 Change the strategy for HttpCacheStore Put operations to have larger and distinct queues for each of the put operation phases. Instead of one queue of 24 maximum in-flight requests, we can now have 64 PutRef requests, 64 PutBlobs requests, and 64 PutFinalize requests operating simultaneously. The purpose is to allow per-shader caching to:
1. Put a higher volume of individual cache items in less wall-clock time
2. Reduce the gap in time between when a ref is put and when it is finalized during times of heavy Put workload

Along with this change there is a System.DerivedDataCache.HttpDerivedDataBackend.CacheStoreStressPut automated test (StressFilter) that puts 1000 records containing 12 byte values in each.  Before this change, the test took 21 seconds to complete.  It now takes 9 seconds to complete.  There are opportunities to improve this further through batching.

#rb Devin.Doucette

[CL 29913201 by zousar shaker in ue5-main branch]
2023-11-23 14:48:23 -05:00
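The change in 9fbac7bd41 from one shared limit to a separate limit per put phase can be illustrated with a small in-flight limiter. This is a sketch under assumptions: `FPhaseLimiter` and `EPutPhase` are hypothetical names, and the real queues presumably also hold the pending requests rather than only counting them.

```cpp
#include <array>
#include <cstddef>
#include <mutex>

// Hypothetical phase identifiers matching the three phases named in the commit.
enum class EPutPhase : std::size_t { PutRef = 0, PutBlobs = 1, PutFinalize = 2 };

class FPhaseLimiter
{
public:
    explicit FPhaseLimiter(int InMaxPerPhase) : MaxPerPhase(InMaxPerPhase) {}

    // Reserves a slot in the given phase; returns false if that phase is full.
    bool TryBegin(EPutPhase Phase)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        int& Count = InFlight[static_cast<std::size_t>(Phase)];
        if (Count >= MaxPerPhase)
        {
            return false;
        }
        ++Count;
        return true;
    }

    // Releases a slot previously reserved with TryBegin.
    void End(EPutPhase Phase)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        int& Count = InFlight[static_cast<std::size_t>(Phase)];
        if (Count > 0)
        {
            --Count;
        }
    }

    int TotalInFlight() const
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return InFlight[0] + InFlight[1] + InFlight[2];
    }

private:
    mutable std::mutex Mutex;
    std::array<int, 3> InFlight{};
    const int MaxPerPhase;
};
```

With a limit of 64 per phase, a full PutRef queue no longer prevents PutBlobs or PutFinalize requests from starting, so a ref that has been put can be finalized without waiting behind unrelated refs.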
mark lintott
b99d530ce0 Moved EpicStudioAnalytics plugin to StudioTelemetry which is now public facing.
The intention is to provide a set of "in the box" telemetry hooks to our most common developer workflows, for internal Epic use and for licensees
Moved startup of StudioTelemetry plugin much earlier to PostConfigInit stage so that sessions can start much earlier in the workflow
Added WIP Client support to telemetry plugin
Added data driven provider support for FAnlayiticProvidersET interfaces via BaseEngine.ini for common games and sample projects. Settings and URLs for Epic are only available in Restricted/NotForLicensee config folders for Lyra and Shooter game.
Added support for Horde telemetry and fixed up various problems with duplicate attributes being sent.
Added Core.VirtualAssets telemetry event
Added Core.Zen telemetry event
Added Core.IAS telemetry event
Added check for IOStoreOnDemand IsEnabled to avoid sending empty IAS events
Added FAnalyticsMulticastProvider to forward telemetry events to multiple providers contained within
Removed deprecated Fire_LoadingEvent from StudioAnalytics
[FYI] paul.chipchase
#rb Wes.Hunt

[CL 29908039 by mark lintott in ue5-main branch]
2023-11-23 07:06:10 -05:00
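The multicast provider mentioned in b99d530ce0 forwards each telemetry event to several providers. A minimal sketch of that fan-out pattern is shown below; `ITelemetryProvider`, `FRecordingProvider`, and `FMulticastProvider` are hypothetical types written for this example and do not reproduce the engine's analytics interfaces.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical provider interface for illustration.
struct ITelemetryProvider
{
    virtual ~ITelemetryProvider() = default;
    virtual void RecordEvent(const std::string& EventName) = 0;
};

// Test double that simply remembers the events it receives.
class FRecordingProvider : public ITelemetryProvider
{
public:
    std::vector<std::string> Events;

    void RecordEvent(const std::string& EventName) override
    {
        Events.push_back(EventName);
    }
};

// Forwards every event to each provider it contains.
class FMulticastProvider : public ITelemetryProvider
{
public:
    void AddProvider(std::shared_ptr<ITelemetryProvider> Provider)
    {
        Providers.push_back(std::move(Provider));
    }

    void RecordEvent(const std::string& EventName) override
    {
        for (const std::shared_ptr<ITelemetryProvider>& Provider : Providers)
        {
            Provider->RecordEvent(EventName);
        }
    }

private:
    std::vector<std::shared_ptr<ITelemetryProvider>> Providers;
};
```

Because the multicast type implements the same interface as the providers it wraps, callers can send events such as `Core.Zen` without knowing how many destinations are configured.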