Commit Graph

31 Commits

Author SHA1 Message Date
henrik karlsson
7647b785f2 [UBA]
* Added support for strings and comments in config files
* Added tests
* Fixed for uba to work integrated in other programs (proper dll exports)
* Added process id to process info in visualizer

[CL 34364596 by henrik karlsson in ue5-main branch]
2024-06-14 02:35:27 -04:00
henrik karlsson
9cd986abf6 [UBA]
* Fixed macos compile error

[CL 34264292 by henrik karlsson in ue5-main branch]
2024-06-11 00:02:05 -04:00
henrik karlsson
6e7cc07abd [UBA]
* Added so NetworkServer can have multiple active crypto keys that each expire at a certain time.
* Added http server to cache service that can be used to register crypto keys. Test with "curl http://localhost:80/addcrypto?0123456701234567,1000" (key will be valid for 1000 seconds)

[CL 34264112 by henrik karlsson in ue5-main branch]
2024-06-10 23:37:00 -04:00
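
The expiring-key behaviour described in the entry above could be sketched roughly as follows. This is a minimal illustration and not UBA's code: `CryptoKeyRegistry`, `AddKey` and `IsValid` are hypothetical names, and the key is kept as an opaque string because the log does not say how keys are stored.

```cpp
#include <chrono>
#include <iterator>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical registry; the names here are illustrative and not taken from UBA.
class CryptoKeyRegistry
{
public:
    // Registers a key that stays valid for `lifetime` counted from `now`.
    void AddKey(const std::string& key, std::chrono::seconds lifetime, Clock::time_point now)
    {
        m_expiry[key] = now + lifetime;
    }

    // Drops every key whose expiry time has been reached, then reports
    // whether `key` is still registered.
    bool IsValid(const std::string& key, Clock::time_point now)
    {
        for (auto it = m_expiry.begin(); it != m_expiry.end();)
            it = (it->second <= now) ? m_expiry.erase(it) : std::next(it);
        return m_expiry.count(key) != 0;
    }

private:
    std::map<std::string, Clock::time_point> m_expiry;
};
```

Passing `now` in explicitly keeps the sketch deterministic; a real server would presumably read the clock itself.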
henrik karlsson
955a26ecbf [UBA]
* Fixed so visualizer can visualize traces coming from UbaCli without needing restart
* Added some more logging to cache server maintenance
* Fixed scroll wheel zoom in visualizer
* Fixed so network server DisconnectClients does not put server in a bad state

[CL 34213980 by henrik karlsson in ue5-main branch]
2024-06-07 16:29:55 -04:00
henrik karlsson
5c693e5f22 [UBA]
* Added support for adding high priority work
* Changed so cache server puts bucket maintenance work at high priority

[CL 34058891 by henrik karlsson in ue5-main branch]
2024-06-02 18:15:45 -04:00
henrik karlsson
fdd2fd52c4 [UBA]
* Renamed system stats to kernel stats
* Changed all stats storage to write a bitfield first that says which fields are non-zero

[CL 34058881 by henrik karlsson in ue5-main branch]
2024-06-02 18:14:50 -04:00
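
One way to read the "bitfield first" change above is a presence mask followed by only the non-zero values. The sketch below assumes a record of eight 64-bit counters and a one-byte mask; the field count, the byte order, and the names `WriteStats` and `ReadStats` are assumptions for illustration, not UBA's actual format.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stats record; field count and layout are assumptions.
constexpr std::size_t kFieldCount = 8;
using Stats = std::array<std::uint64_t, kFieldCount>;

// Writes a one-byte mask whose bit i is set when field i is non-zero,
// followed by the non-zero fields only (8 bytes each, little-endian).
std::vector<std::uint8_t> WriteStats(const Stats& stats)
{
    std::vector<std::uint8_t> out;
    std::uint8_t mask = 0;
    for (std::size_t i = 0; i < kFieldCount; ++i)
        if (stats[i] != 0)
            mask |= static_cast<std::uint8_t>(1u << i);
    out.push_back(mask);
    for (std::size_t i = 0; i < kFieldCount; ++i)
    {
        if (stats[i] == 0)
            continue;
        for (int byte = 0; byte < 8; ++byte)
            out.push_back(static_cast<std::uint8_t>(stats[i] >> (8 * byte)));
    }
    return out;
}

// Reads the mask, then restores each flagged field; unflagged fields stay zero.
Stats ReadStats(const std::vector<std::uint8_t>& in)
{
    Stats stats{};
    std::size_t pos = 0;
    const std::uint8_t mask = in.at(pos++);
    for (std::size_t i = 0; i < kFieldCount; ++i)
    {
        if ((mask & (1u << i)) == 0)
            continue;
        std::uint64_t value = 0;
        for (int byte = 0; byte < 8; ++byte)
            value |= static_cast<std::uint64_t>(in.at(pos++)) << (8 * byte);
        stats[i] = value;
    }
    return stats;
}
```

With this layout a record that is mostly zeros costs one byte plus eight bytes per non-zero field, instead of a fixed 64 bytes.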
henrik karlsson
adfccca28f [UBA]
* Fixed bug in text normalization code, which was missing one character
* Fixed so cache processes are colored differently in visualizer
* Improved assert information when pressing ctrl-c during build and an assert triggers

[CL 32957554 by henrik karlsson in ue5-main branch]
2024-04-13 19:21:42 -04:00
henrik karlsson
06a0a5537c [Uba]
* Bumped table sizes to prevent oom when building our largest non unity builds
* Added bucket support in cache system
* Added ParallelFor function to WorkManager
* Fixed so annoying logging when pressing ctrl-c is muted

[CL 32947130 by henrik karlsson in ue5-main branch]
2024-04-12 18:04:58 -04:00
henrik karlsson
5dc54a8104 [UBA]
* Huge stability push... all (known) paths have been tested heavily on linux with tsan and every single found race condition report has been fixed. Lots of locks have been added/moved/changed and some instances of things have been leaked on purpose to prevent tsan reports during shutdown
* More efficient storage proxy implementation which immediately forwards segments to clients once they are available in proxy
* Added UbaAgent -command=x which can be used to send commands to host. Supported commands are "status", which prints out status of all remote sessions, "abort/abortproxy/abortnonproxy", which can be used to abort remote sessions, and "disableremote" to have host stop accepting more helpers
* Fixed scheduler::stop bug if remotes were still requesting processes
* Added support for process reuse on linux/macos
* Added support for Coordinator interface and dynamically load coordinator dll in UbaCli
* Restructured code a little bit to be able to queue up all writes in parallel
* Added Show create/write colors to visualizer (defaults to on)
* Fixed so write file times are visualized in visualizer
* Improved socket path for visualizer
* Improved a lot of error messages
* Fixed double close of memory handle in StorageServer
* Changed some ScopedWriteLock to SCOPED_WRITE_LOCK (same for read locks)
* Fixed some missing cleanup of trace view when starting a new trace view in visualizer

[CL 32137083 by henrik karlsson in ue5-main branch]
2024-03-08 18:31:48 -05:00
henrik karlsson
692b672f93 [UBA]
* Fixed bug where additional work could be added without worker picking it up causing a deadlock

[CL 32022334 by henrik karlsson in ue5-main branch]
2024-03-05 11:32:25 -05:00
henrik karlsson
bbf8e06ccb [UBA]
* Added support for handling server side messages and send response later
* Changed so storage proxy errors are muted when disconnected
* Improved error messages

[CL 31981923 by henrik karlsson in ue5-main branch]
2024-03-03 21:30:10 -05:00
henrik karlsson
ffbd2809a5 [UBA]
* Fixed race condition in UbaNetworkClient related to message ids and returning message ids after use
* Implemented network backend that is using memory for communication (will be used between client and proxy server when running inproc)
* Added connection uid provided by network backend to be able to improve error messages
* Added NetworkBackend GetTotalSendAndRecv that can be used to fetch all traffic on backend

[CL 31981782 by henrik karlsson in ue5-main branch]
2024-03-03 21:07:10 -05:00
henrik karlsson
0b3f5e6f95 [UBA]
* Extracted work tracker out from work manager so it can be used for multiple work managers when running with proxies

[CL 31981685 by henrik karlsson in ue5-main branch]
2024-03-03 20:51:45 -05:00
henrik karlsson
79f24ee323 [UBA]
* Changed so sending in a 0 workercount will use logical core count
* Cleaned up code around work tracking

[CL 31796079 by henrik karlsson in ue5-main branch]
2024-02-26 01:35:40 -05:00
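
The zero-means-default rule from the entry above can be sketched with the standard library. `ResolveWorkerCount` is a hypothetical helper, and falling back to 1 when the core count is unknown is an assumption of this sketch, not something the log states.

```cpp
#include <thread>

// Hypothetical helper: a requested count of 0 means "use the logical core count".
unsigned ResolveWorkerCount(unsigned requested)
{
    if (requested != 0)
        return requested;
    // hardware_concurrency() may return 0 when the value cannot be determined.
    const unsigned logical = std::thread::hardware_concurrency();
    return logical != 0 ? logical : 1;
}
```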
henrik karlsson
488b4334ab [UBA]
* Elevate thread priority for the threads doing all the networking to make sure proxy and host are prioritizing receiving/sending network data

[CL 31796023 by henrik karlsson in ue5-main branch]
2024-02-26 01:29:10 -05:00
henrik karlsson
0b12758059 [UBA]
* Changed so all ScopedReadLock and ScopedWriteLock are macros (SCOPED_READ_LOCK and SCOPED_WRITE_LOCK).. this to be able to add code around the lock to measure contention. Contention testing code can be enabled by setting UBA_TRACK_CONTENTION to 1

[CL 31795889 by henrik karlsson in ue5-main branch]
2024-02-26 01:14:42 -05:00
henrik karlsson
18005e4601 [UBA]
* Disabled custom signal handler in UbaHost in case this is the reason we get crashes in ubt
* Changed some logging from Info to Detail

[CL 31742437 by henrik karlsson in ue5-main branch]
2024-02-22 18:46:44 -05:00
henrik karlsson
08546e2d22 [Uba]
* NetworkBackendTcp - Changed so when having failed connection we wait for socket thread before taking lock to delete it.
* NetworkBackendTcp - Removed check for 0 connections. That was a hack and now all code should properly wait
* Changed infinite wait of event to 60 seconds and output error if it times out
* Removed NetworkBackend::Close. This should never be done on the outside. Outside should always call shutdown and then wait for disconnect callback
* Moved shutdown of socket to outside lock to prevent deadlock

[CL 31731469 by henrik karlsson in ue5-main branch]
2024-02-22 15:02:42 -05:00
henrik karlsson
eb7f4122df [Uba]
* Fixed shutdown issues where client and server did not wait for connection to be closed before closing down
* Added error handling for trying to decompress bad cas files

[CL 31715626 by henrik karlsson in ue5-main branch]
2024-02-22 01:56:35 -05:00
henrik karlsson
af9dbfb4ea [Uba]
* Removed line that cleared m_connections in UbaNetworkServer. That should not be there

[CL 31710487 by henrik karlsson in ue5-main branch]
2024-02-21 20:42:33 -05:00
henrik karlsson
8255fb00b5 [Uba]
* Added more description when failing to drop cas db entry
* Added cas entry lock around code that copy or link file out
* Added more fixes for things reported by tsan/asan

[CL 31708309 by henrik karlsson in ue5-main branch]
2024-02-21 20:09:27 -05:00
henrik karlsson
f0e8aebe39 [Uba]
* Fixed a UbaAgent shutdown hang that can happen with certain timings where workers go from used to non-used at the same time as they are being stopped

[CL 31698187 by henrik karlsson in ue5-main branch]
2024-02-21 16:49:26 -05:00
henrik karlsson
fff9d32f8d [UBA]
* Added support for workers being able to push current work and pick up new work

[CL 31607406 by henrik karlsson in ue5-main branch]
2024-02-19 02:01:21 -05:00
henrik karlsson
141b357011 [Uba]
* Added code to validate that connections from server to client are all from the same server. Solved by giving servers a unique id and making sure it is part of the connection handshake with the client. Client will just discard connections that provide a different server uid than the first connection

This will hopefully fix the bug seen on our farm a couple of days ago. Theory is that two servers managed to connect to the same helper. First server is just about to start connecting to a helper when the helper is brought down and restarted to help another server. Some sort of stall happens on the first server, and once it starts the actual tcp connection the helper has been brought up to help the other server. The helper allows more than one tcp connection for performance reasons, so it will accept tcp connections from both the old and the new server. In this messed up scenario some messages would go to one server and some to the other, causing all kinds of weird things. It is critical that all connections go to the same server since all messages are just round-robined across the tcp connections.

[CL 31460950 by henrik karlsson in ue5-main branch]
2024-02-14 01:23:19 -05:00
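
The client-side rule described above (remember the first server uid, discard connections that present a different one) could look roughly like this. `ServerUidGuard` and the 64-bit uid are assumptions for illustration; the log does not show UBA's real handshake or uid type.

```cpp
#include <cstdint>

// Hypothetical client-side check; not UBA's actual handshake code.
class ServerUidGuard
{
public:
    // Accepts the first connection unconditionally and remembers its server uid.
    // Later connections are accepted only if they present the same uid.
    bool AcceptConnection(std::uint64_t serverUid)
    {
        if (!m_hasUid)
        {
            m_firstUid = serverUid;
            m_hasUid = true;
            return true;
        }
        return m_firstUid == serverUid;
    }

private:
    bool m_hasUid = false;
    std::uint64_t m_firstUid = 0;
};
```

A real client accepting connections on several threads would need to guard this state with a lock; the sketch leaves that out.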
henrik karlsson
cd8350a752 [Uba]
* Added so if NetworkServer fails to stop connections after x seconds because of running workers we abort the entire process. I'm hoping we will get

[CL 30399360 by henrik karlsson in ue5-main branch]
2023-12-19 13:29:44 -05:00