Commit Graph

53 Commits

Author SHA1 Message Date
henrik karlsson
5dc54a8104 [UBA]
* Huge stability push... all (known) paths have been tested heavily on linux with tsan and every single found race condition report has been fixed. Lots of locks have been added/moved/changed and some instances of things have been leaked on purpose to prevent tsan reports during shutdown
* More efficient storage proxy implementation which immediately forwards segments to clients once they are available in the proxy
* Added UbaAgent -command=x which can be used to send commands to the host. Supported commands are "status", which prints out the status of all remote sessions, "abort/abortproxy/abortnonproxy", which can be used to abort remote sessions, and "disableremote", which makes the host stop accepting more helpers
* Fixed scheduler::stop bug if remotes were still requesting processes
* Added support for process reuse on linux/macos
* Added support for Coordinator interface and dynamically load coordinator dll in UbaCli
* Restructured code a little bit to be able to queue up all writes in parallel
* Added Show create/write colors to visualizer (defaults to on)
* Fixed so write file times are visualized in visualizer
* Improved socket path for visualizer
* Improved a lot of error messages
* Fixed double close of memory handle in StorageServer
* Changed some ScopedWriteLock to SCOPED_WRITE_LOCK (same for read locks)
* Fixed some missing cleanup of trace view when starting a new trace view in visualizer

[CL 32137083 by henrik karlsson in ue5-main branch]
2024-03-08 18:31:48 -05:00
henrik karlsson
0b12758059 [UBA]
* Changed so all ScopedReadLock and ScopedWriteLock are macros (SCOPED_READ_LOCK and SCOPED_WRITE_LOCK).. this to be able to add code around the lock to measure contention. Contention testing code can be enabled by setting UBA_TRACK_CONTENTION to 1

[CL 31795889 by henrik karlsson in ue5-main branch]
2024-02-26 01:14:42 -05:00
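The idea of wrapping lock acquisition in a macro so that contention can be measured could be sketched roughly as below. This is a minimal illustration under stated assumptions, not UBA's actual code: `ContentionStats`, `TimedWriteLock`, `SKETCH_TRACK_CONTENTION` and `SKETCH_SCOPED_WRITE_LOCK` are hypothetical names, and `std::shared_mutex` stands in for whatever lock type UBA uses.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <shared_mutex>

// Hypothetical switch, playing the role the commit gives to UBA_TRACK_CONTENTION.
#ifndef SKETCH_TRACK_CONTENTION
#define SKETCH_TRACK_CONTENTION 1
#endif

// Hypothetical counters collected per lock site.
struct ContentionStats {
    std::atomic<uint64_t> contendedAcquires{0};
    std::atomic<uint64_t> waitNanoseconds{0};
};

// Scoped write lock that records how long it had to wait for the mutex.
class TimedWriteLock {
public:
    TimedWriteLock(std::shared_mutex& mutex, ContentionStats& stats) : m_mutex(mutex) {
        // Fast path: the lock was free, nothing to record.
        // try_lock may fail spuriously, so the counters are approximate.
        if (m_mutex.try_lock())
            return;
        auto start = std::chrono::steady_clock::now();
        m_mutex.lock();
        auto waited = std::chrono::steady_clock::now() - start;
        stats.contendedAcquires.fetch_add(1, std::memory_order_relaxed);
        stats.waitNanoseconds.fetch_add(
            static_cast<uint64_t>(
                std::chrono::duration_cast<std::chrono::nanoseconds>(waited).count()),
            std::memory_order_relaxed);
    }
    ~TimedWriteLock() { m_mutex.unlock(); }
    TimedWriteLock(const TimedWriteLock&) = delete;
    TimedWriteLock& operator=(const TimedWriteLock&) = delete;

private:
    std::shared_mutex& m_mutex;
};

// Going through a macro lets the measuring code be compiled in or out
// without touching any call site.
#if SKETCH_TRACK_CONTENTION
#define SKETCH_SCOPED_WRITE_LOCK(mutex, stats, name) TimedWriteLock name(mutex, stats)
#else
#define SKETCH_SCOPED_WRITE_LOCK(mutex, stats, name) std::unique_lock<std::shared_mutex> name(mutex)
#endif
```

A call site would then read `SKETCH_SCOPED_WRITE_LOCK(m_lock, m_lockStats, lock);` and behave like an ordinary scoped lock when tracking is disabled.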
henrik karlsson
4373e37640 [Uba]
* Fixed a bug in UbaWorkManager shutdown where intrusive linked list contains bad instances while deleting workers
* Fixed more tsan errors when doing ctrl-c on host

[CL 31676297 by henrik karlsson in ue5-main branch]
2024-02-21 02:52:38 -05:00
henrik karlsson
5adb64e10e [UBA]
* Fixed read-after-free bug found by TSAN in StorageServer.cpp
* Fixed write-after-free bug found in UbaProcess.cpp (m_cancelEvent.Set() could be called after memory was freed)
* Lots of minor fixes found using TSAN. Most are harmless but still nice to clean up
* Disabled mimalloc for linux again.. seems like tsan does not like it so maybe there are bugs in it

[CL 31676014 by henrik karlsson in ue5-main branch]
2024-02-21 01:43:26 -05:00
henrik karlsson
153bc16e50 [Uba]
* Added O_CLOEXEC to all open calls to prevent file descriptors from leaking to child processes. This seems to really fix the ETXTBSY errors
* Added read hint for DecompressMemoryToMemory to be able to show better error messages

[CL 31636371 by henrik karlsson in ue5-main branch]
2024-02-19 20:43:29 -05:00
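For context, a file descriptor opened without O_CLOEXEC is inherited by any child that is forked and exec'd, and a leaked descriptor that is open for writing can make a later exec of that same file fail with a "text file busy" error. A minimal POSIX sketch of the technique follows; `OpenCloexec` and `HasCloexec` are hypothetical helper names used for illustration and are not taken from the UBA source.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: opens a file so that the descriptor is closed
// automatically when the process (or a forked child) calls exec().
int OpenCloexec(const char* path, int flags, mode_t mode = 0644) {
    assert(path != nullptr);
    return open(path, flags | O_CLOEXEC, mode);
}

// Hypothetical helper: reports whether the close-on-exec flag is set on fd.
bool HasCloexec(int fd) {
    int fdFlags = fcntl(fd, F_GETFD);
    return fdFlags != -1 && (fdFlags & FD_CLOEXEC) != 0;
}
```

Passing the flag to `open` directly, rather than setting FD_CLOEXEC afterwards with `fcntl`, avoids a window in which another thread could fork and exec while the descriptor is still inheritable.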
henrik karlsson
d4bac982d7 [UBA]
* Added so session information contains information about number of processor groups
* Added so detoured process matches uba message thread's processor group

[CL 31607498 by henrik karlsson in ue5-main branch]
2024-02-19 02:08:44 -05:00
henrik karlsson
eda007f51f [UBA]
* Added support for most features in UbaScheduler. Dependencies, weights etc etc.
* Added support for loading yaml file with processes for UbaScheduler.
* Added so UbaCli interprets a yaml file as a file with processes and uses it to populate a scheduler, which is then executed
* Fixed so all threads inside the same process spawned by uba end up in the same thread group.
* Fixed linux crash where process communication memory was deleted when cancel event was called (added lock around code)
* Fixed deadlock that could happen if flush dead processes was called after the lock but before the processhandle dtor in Session::ProcessExited
* Changed new[]/delete[] to aligned_alloc/free because for some reason new/delete trigger asan on linux and don't know why.

[CL 31372220 by henrik karlsson in ue5-main branch]
2024-02-11 04:00:51 -05:00
henrik karlsson
c5b06cbba0 [UBA]
* Added more information for when process dies without sending exit message

[CL 31367591 by henrik karlsson in ue5-main branch]
2024-02-10 12:38:52 -05:00
henrik karlsson
cca66a4ea3 [UBA]
* Removed logging entries that were only added to make horde not time out.. since this was implemented horde has added ping logic so this is not needed anymore
* Fixed bug in callstack logging on posix
* Changed a bunch of %hs to %s
* Added allocator debug code to try to figure out asan issues

[CL 31281576 by henrik karlsson in ue5-main branch]
2024-02-07 20:36:18 -05:00
sergio gardeazabal
9728fc156b [UBA] Disable power throttling in created processes to ensure P-Cores are preferred over E-Cores on Intel hybrid architecture platforms
#jira UE-205966

#rb henrik.karlsson

[CL 31193922 by sergio gardeazabal in ue5-main branch]
2024-02-05 17:56:36 -05:00
henrik karlsson
eee33ba668 [UBA]
* Fixed so stdout/err is redirected to pipe on detoured processes on linux/mac.. this should fix so errors are visualized in the right place
* Added -nostdout to UbaCli to be able to check that redirected stdout/err works
* Added version for process messages to catch issues where ppl have old ubaagent.exe but new ubadetours.dll
* Added capacity parameter to FixPath and added asserts to check that we never write outside capacity
* Fixed code creating g_virtualApplication buffer.
* Disabled asserts for mac non debug builds
* Removed detoured write/fwrite for posix now when we instead redirect stdout/err
* Enabled some unit tests for linux now when they work as intended regarding redirected stdout/err

[CL 31029918 by henrik karlsson in ue5-main branch]
2024-01-30 19:05:51 -05:00
henrik karlsson
0eaf4b8111 [UBA]
* Fixed some error handling for posix
* Improved some logging
* Disabled some unit tests on linux because they always fail

[CL 30991796 by henrik karlsson in ue5-main branch]
2024-01-30 02:17:34 -05:00
henrik karlsson
3ea6ef91db [UBA]
* Removed error on linux for failing poll call

[CL 30987771 by henrik karlsson in ue5-main branch]
2024-01-29 23:59:19 -05:00
zack neyland
505cac5c83 UBA: Add some ifdefs around POLLUP for platform differences
[CL 30983584 by zack neyland in ue5-main branch]
2024-01-29 21:01:39 -05:00
zack neyland
59077f46f4 UBA: Close fds when done with polling to stop leaking on MacOS.
[CL 30980705 by zack neyland in ue5-main branch]
2024-01-29 18:45:46 -05:00
henrik karlsson
0d51a68c53 [Uba]
* Fixed file descriptor leak when running native processes on posix platforms
* Removed hangup test on poll for pipe file descriptors since it exited too early and missed reads

[CL 30963156 by henrik karlsson in ue5-main branch]
2024-01-29 02:13:15 -05:00
josh adams
2c6d16634e Fixing working dir on Mac
#rb henrik.karlsson

[CL 30891145 by josh adams in ue5-main branch]
2024-01-25 12:31:40 -05:00
zack neyland
c10257601e UBA: Bump cmdline limit to 64k
[CL 30888128 by zack neyland in ue5-main branch]
2024-01-25 11:10:13 -05:00
henrik karlsson
def11cc5d0 [UBA]
* Improved error handling for setpriority even more. There is a risk that a non-detoured process has exited already when we get to this function

[CL 30878827 by henrik karlsson in ue5-main branch]
2024-01-25 01:12:09 -05:00
henrik karlsson
29914bf077 [UBA]
* Fixed so if setpriority fails on linux because of not having permissions we don't fail the build

[CL 30878645 by henrik karlsson in ue5-main branch]
2024-01-25 01:03:44 -05:00
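Treating a failed setpriority call as non-fatal could look roughly like the sketch below. It assumes a POSIX system; `PriorityResult` and `TrySetPriority` are hypothetical names, and the choice of which errno values to tolerate is an assumption based on the documented setpriority errors, not a copy of UBA's logic.

```cpp
#include <cerrno>
#include <sys/resource.h>
#include <sys/types.h>

// Hypothetical result type: Ignored means the call failed for a reason
// that should not fail the build.
enum class PriorityResult { Ok, Ignored, Failed };

// Tries to change the nice value of a process (pid 0 means the caller).
PriorityResult TrySetPriority(pid_t pid, int niceValue) {
    if (setpriority(PRIO_PROCESS, static_cast<id_t>(pid), niceValue) == 0)
        return PriorityResult::Ok;
    switch (errno) {
    case EACCES: // not allowed to lower the nice value
    case EPERM:  // target process belongs to another user
    case ESRCH:  // target process no longer exists
        return PriorityResult::Ignored;
    default:
        return PriorityResult::Failed;
    }
}
```

The ESRCH case also covers a process that has exited before the call is made, so the caller can log and move on in both situations.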
henrik karlsson
abae032917 [UBA]
* Improved error description when failing to create pipe on posix

[CL 30875937 by henrik karlsson in ue5-main branch]
2024-01-24 22:22:28 -05:00
henrik karlsson
659c4c9101 [UBA]
* Changed so com memory is not split up in read and write since they never overlap.. so both read and write can use all the memory

[CL 30871523 by henrik karlsson in ue5-main branch]
2024-01-24 20:33:10 -05:00
henrik karlsson
ade0725e14 [UBA]
* Fixed so imagehlp.dll and dbghelp.dll are detoured when loaded. Detour ImageGetDigestStream and SymLoadModuleExW because both of them cause trouble on wine
* Fixed so reuse of processes also honors log files so a new log file is created when reuse happens
* Improved log file naming so host can set name and log files are sent back properly with the right name

[CL 30638193 by henrik karlsson in ue5-main branch]
2024-01-16 13:24:28 -05:00
henrik karlsson
4b1c00af7d [UBA]
* Fixed stats reporting bug for process reuse

[CL 30588189 by henrik karlsson in ue5-main branch]
2024-01-12 01:51:57 -05:00
henrik karlsson
22cf61f84b [UBA]
* Added GetNextProcess as a real message type (not using Custom) to be able to stop fetching work if helper is being terminated by aws or if we want to scale down number of workers
* Did some cleanup in the Tcp backend code and made sure not to call close socket multiple times (since it can cause the code to close a different socket than what is owned)

[CL 30584707 by henrik karlsson in ue5-main branch]
2024-01-11 21:01:56 -05:00
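The hazard with closing a socket more than once is that the descriptor number may already have been handed out again, so the second close hits an unrelated socket. One common way to guard against this is sketched below for POSIX descriptors; `OwnedFd` is a hypothetical wrapper written for illustration, not the approach confirmed to be used in the UBA Tcp backend.

```cpp
#include <atomic>
#include <unistd.h>

// Hypothetical owning wrapper: the descriptor is swapped out for -1 before
// closing, so a second Close() (or the destructor) becomes a no-op.
class OwnedFd {
public:
    explicit OwnedFd(int fd = -1) : m_fd(fd) {}
    OwnedFd(const OwnedFd&) = delete;
    OwnedFd& operator=(const OwnedFd&) = delete;
    ~OwnedFd() { Close(); }

    // Returns true only for the call that actually closed the descriptor.
    bool Close() {
        int fd = m_fd.exchange(-1);
        if (fd == -1)
            return false;
        close(fd);
        return true;
    }

    int Get() const { return m_fd.load(); }

private:
    std::atomic<int> m_fd;
};
```

The atomic exchange means that even if two threads race to close, only one of them ever passes the real descriptor to `close`.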