Commit Graph

165 Commits

Author SHA1 Message Date
andriy tylychko
9da6a65dfe Mac compilation: added "-faligned-new" if an old macOS is targeted to avoid "aligned new is not supported" error. Broken build: https://horde.devtools.epicgames.com/job/618160c0fc786a000148b7f4?step=ea45
preflight https://horde.devtools.epicgames.com/job/6182954f300d520001e9e1fb?step=1df5

#jira UE-131480
#rb will.damon

#ROBOMERGE-AUTHOR: andriy.tylychko
#ROBOMERGE-SOURCE: CL 18049627 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v885-17909292)

[CL 18049637 by andriy tylychko in ue5-release-engine-test branch]
2021-11-04 05:05:04 -04:00
chris varnsverry
54783f2790 - Fix aligned-allocation-related compile failures in targets building against old OSX versions.
[at]Michael.Kirzinger [at]Sam.Zamani
#jira UE-133052

#ROBOMERGE-AUTHOR: chris.varnsverry
#ROBOMERGE-SOURCE: CL 18022672 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v885-17909292)

[CL 18022733 by chris varnsverry in ue5-release-engine-test branch]
2021-11-02 14:22:11 -04:00
andriy tylychko
6bf3101dcd Deprecated FTicker and family and replaced them with the thread-safe FTSTicker
#jira UE-120090
#rb francis.hurteau


#ROBOMERGE-SOURCE: CL 17176325 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v855-17104924)

[CL 17176374 by andriy tylychko in ue5-release-engine-test branch]
2021-08-16 11:09:22 -04:00
patrick laflamme
dc2f183374 Fixed hang in crash reporter client when reporting a bug in unattended mode.
- Ensured the new FTSTicker gets ticked when running CRC in unattended mode.

#rb Jerome.Delattre
[FYI] Dmytro.vovk

#ROBOMERGE-SOURCE: CL 17157164 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v855-17104924)

[CL 17157202 by patrick laflamme in ue5-release-engine-test branch]
2021-08-12 15:34:48 -04:00
patrick laflamme
e35961b2ad Removed CrashReportClient analytic field 'MonitorQueryingPipe' that was temporarily added to verify if CRC crashed while reading the pipe.
- The data show no evidence that CRC is crashing there. Capturing this state is I/O expensive and not required moving forward.

#jira UETOOL-4042 Inspect UE5/Main analytics for CRC crashes
#rb Jamie.Dale

#ROBOMERGE-SOURCE: CL 17116844 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v855-17104924)

[CL 17116855 by patrick laflamme in ue5-release-engine-test branch]
2021-08-10 10:58:16 -04:00
aurel cordonnier
02c0f425e8 Copy up from Release-Engine-Staging @ 16738359
This represents UE4/Main @ 16738161 and Dev-PerfTest @ 16737719

[CL 16738582 by aurel cordonnier in ue5-release-engine-test branch]
2021-06-22 00:27:54 -04:00
brandon schaefer
a90cdbe7c2 Rename LinuxAArch64 to LinuxArm64
#jira UE-118127
#rb Michael.Sartain
[FYI] Marc.Audy, Aurel.Cordonnier

#ROBOMERGE-SOURCE: CL 16660821 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v834-16658389)

[CL 16660830 by brandon schaefer in ue5-release-engine-test branch]
2021-06-14 13:40:06 -04:00
patrick laflamme
79392843da Added temporary diagnostic code to CrashReportClient in the hope of narrowing down why it suspiciously dies so often.
#rb Jamie.Dale

#ROBOMERGE-SOURCE: CL 16640468 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v831-16623017)

[CL 16640473 by patrick laflamme in ue5-release-engine-test branch]
2021-06-11 08:51:56 -04:00
david harvey
5467ad8d2c adding the option to hide the 'submit and restart' crash reporter option for platforms that do not support it.
#jira UE-93432
#rnx
#rb Patrick.Laflamme

#ROBOMERGE-SOURCE: CL 16622682 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v830-16605563)

[CL 16622684 by david harvey in ue5-release-engine-test branch]
2021-06-10 09:59:46 -04:00
patrick laflamme
f8bc59a1c3 Updated the crash report analytic session summary version number from 4 to 5.
#rb trivial

#ROBOMERGE-SOURCE: CL 16592549 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v828-16531559)

[CL 16592557 by patrick laflamme in ue5-release-engine-test branch]
2021-06-08 16:11:35 -04:00
aurel cordonnier
25a11deeac Merge from Release-Engine-Staging @ 16579919
This represents UE4/Main @ 16579691 and Dev-PerfTest @ 16579576

[CL 16581170 by aurel cordonnier in ue5-release-engine-test branch]
2021-06-07 20:09:45 -04:00
Patrick Laflamme
571c8ffe14 Added analytics to report more granular metrics about CRC performance when handling crash/ensure/stall.
#rb Jamie.Dale

[CL 16570467 by Patrick Laflamme in ue5-main branch]
2021-06-07 10:37:55 -04:00
Patrick Laflamme
aa76047110 Stripped the callstack from the ensure error message when logging the ensure message in the analytics diagnostic log.
- This information is not required for analytics purposes.

#rb Jamie.Dale

[CL 16533474 by Patrick Laflamme in ue5-main branch]
2021-06-02 09:16:50 -04:00
Patrick Laflamme
ef94e39d6c UETOOL-3650 - Delete expired UECrashContext-pid.xml that could be left over by crashed/killed CRC
- Added code to run a cleanup on UECrashContext-{pid}.xml files that are 30 days old and whose process ID (the pid in the name) is no longer running.

#rb Jamie.Dale

[CL 16522984 by Patrick Laflamme in ue5-main branch]
2021-06-01 16:55:30 -04:00
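The cleanup rule described in ef94e39d6c can be sketched as a small predicate. This is only an illustration, not the code from the changelist: the function names, the "at least 30 days" reading of the age threshold, and the injected isProcessRunning callback are all assumptions made so the logic can be exercised without touching the file system or the OS process table.

```cpp
#include <chrono>
#include <functional>
#include <string>

// Hypothetical helper: extracts <pid> from "UECrashContext-<pid>.xml".
// Returns false if the name does not match that pattern.
bool ParseCrashContextPid(const std::string& fileName, unsigned long& outPid)
{
    const std::string prefix = "UECrashContext-";
    const std::string suffix = ".xml";
    if (fileName.size() <= prefix.size() + suffix.size()) return false;
    if (fileName.compare(0, prefix.size(), prefix) != 0) return false;
    if (fileName.compare(fileName.size() - suffix.size(), suffix.size(), suffix) != 0) return false;

    const std::string digits =
        fileName.substr(prefix.size(), fileName.size() - prefix.size() - suffix.size());
    if (digits.size() > 9) return false;  // keeps the value within 32 bits

    unsigned long pid = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') return false;
        pid = pid * 10 + static_cast<unsigned long>(c - '0');
    }
    outPid = pid;
    return true;
}

// Hypothetical predicate: a file is deleted only if its name matches, it is at
// least 30 days old (assumed threshold semantics), and the process named in
// the file is no longer running.
bool ShouldDeleteCrashContext(const std::string& fileName,
                              std::chrono::hours age,
                              const std::function<bool(unsigned long)>& isProcessRunning)
{
    constexpr std::chrono::hours maxAge{24 * 30};
    unsigned long pid = 0;
    if (!ParseCrashContextPid(fileName, pid)) return false;
    return age >= maxAge && !isProcessRunning(pid);
}
```

Checking the pid in addition to the age protects a context file that still belongs to a long-running process, which is the case the commit message singles out.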
aurel cordonnier
43fa62fcd8 Merge from Release-Engine-Test @ 16487383 to UE5/Main
This represents UE4/Main @ 16445039 and Dev-PerfTest @ 16444526

[CL 16488106 by aurel cordonnier in ue5-main branch]
2021-05-27 13:40:37 -04:00
Patrick Laflamme
9925c861ae Prevent the engine from providing all threads to CRC on stall and ensure. Only transmit the responsible thread, so that CRC doesn't need to walk all threads before resuming the engine.
- Analytics show that CRC may take up to 400 seconds to walk all the threads and create a minidump before responding back to the engine on the pipe to resume the crashing thread.

#rb Johan.Berg
#fyi Geoff.Evans

[CL 16483781 by Patrick Laflamme in ue5-main branch]
2021-05-27 10:10:12 -04:00
danny couture
c5710a3eba Fix non-unity build
#rb trivial
#rnx

[CL 16469067 by danny couture in ue5-main branch]
2021-05-26 10:18:35 -04:00
Patrick Laflamme
9f3f8f25e8 Removed duplicated module dependency on Analytics from CrashReportClientEditor build script.
Added missing include in AnalyticsPropertyStore.cpp
Added ENGINE_API to expose EngineAnalyticsSessionSummary publicly.
Added the longest unattended crash report processing time measured during a session to the CRC summary to track how fast or how slow CRC is at managing ensures/stalls.

#rb Jamie.Dale

[CL 16404437 by Patrick Laflamme in ue5-main branch]
2021-05-20 11:39:14 -04:00
Patrick Laflamme
cb3aaf9000 Fixed analytics summary system losing type information.
- Instead of converting summary properties to string during session aggregation, put properties in FAnalyticsEventAttribute instances to preserve the type information.
Added MonitorEngineVersion, MonitorReportCount, MonitorEnsureCount, MonitorAssertCount to CRC analytics summary.
Added MissingDataFrom to the analytics summary when the property file from a collaboration process failed to load.
Fixed the analytics summary manager to aggregate and produce a report even if the helper process data couldn't be loaded.

#rb Jamie.Dale

[CL 16382795 by Patrick Laflamme in ue5-main branch]
2021-05-19 06:22:37 -04:00
Patrick Laflamme
d62b1c4fea Fixed Editor analytics session summary key 'MonitorLogs' as 'MonitorLog' to be consistent with previous versions.
#rb Trivial.

[CL 16328917 by Patrick Laflamme in ue5-main branch]
2021-05-14 09:18:54 -04:00
Jeff Farris
c2c4f4ac98 Fixed CrashReportClientEditor compile error in unity builds.
#fyi Patrick.Laflamme

[CL 16326683 by Jeff Farris in ue5-main branch]
2021-05-14 00:48:14 -04:00
Patrick Laflamme
2e5316e1ca Generalized the Editor analytics summary session system to be usable/extendable by other apps.
Engine/Editor changes:

- Split the Editor summary session in two, one summary for the Engine properties and one for the Editor specific properties. Made it easy to extend the Engine summary to create other summaries.
- Made the summary sender as agnostic as possible of the keys it sends.
- Fixed the system-wide lock contention between processes when persisting a session. (One problem caused by the lock is UE-114315).
- Fixed concurrency issue when saving the summary sessions on Linux/Mac
- Fixed performance issue when saving the summary session on Linux/Mac. This enables saving at higher frequency.
- Fixed cases where the same session summary is sent more than once.
- Fixed Windows registry key overflow that could happen if we accumulated too many sessions (in theory, this can happen)
- Made adding new properties to the summary easy and private to the implementation.
- Brought the Linux/Mac implementation closer to Windows implementation.
- Reduced memory allocation, especially when the session records a crash.
- Improved chances to send the summary non-delayed by allowing the Editor to send the reports if CRC died unexpectedly.
- Generalized the support to collect and aggregate analytics from helper processes. For example, CRC already collects analytics that is merged with the Editor summary as information supplement
- Reserved the disk space required to store the summary ahead of time to prevent failing later.
- Increased frequency at which the summary is persisted because saving the summary is more efficient. (About every 10 seconds rather than every minute).
- Added unit tests

CrashReportClient changes:

- Created a 'session summary' from the CRC point of view to merge with the Editor summary.
- Moved analytics collection into a separate class to make the crash reporting code leaner and less noisy with all the analytics
- Merged the CRC diagnostic logger into the class collecting the CRC analytics summary and made the diagnostic log a property in the summary.
- Collected analytics (on behalf of Editor) in a background thread because CRC main thread can be blocked collecting a crash, so it doesn't pay attention to other things
- Added MonitorBatteryLevel and MonitorOnACPower summary properties on Windows. Collected on CRC background thread (never blocked, so we reduce the chance of missing the battery running out)
- Added MonitorSessionDuration summary property to track how long CRC ran.
- Added MonitorQuitSignalRecv summary property to detect when CRC is soft killed like: taskkill /PID 1234
- Added MonitorIsReportingCrash summary property to track when CRC dies reporting a crash.
- Added MonitorIsCollectingCrash summary property to track when CRC dies collecting crash artifacts.
- Added IsProcessingCrash summary property to track when CRC dies processing a crash.
- Added MonitorCrashed summary property to track when CRC exception handler was triggered.
- Added MonitorWasShutdown summary property to track when CRC summary was shutdown
- Added MonitorLoggingOut summary property to track when CRC died because the user was logging out (or as result of shutting down or restarting the computer).
- More accurate value for DeathTimestamp summary property because this is now captured in CRC background thread (which cannot be busy handling a crash)
- Added crash processing timing to CRC diagnostic logs (how long it takes to collect/process a crash).

#rb Jamie.Dale, Wes.Hunt, Johan.Berg
#jira UETOOL-3500
#jira UE-114315

[CL 16324612 by Patrick Laflamme in ue5-main branch]
2021-05-13 21:58:20 -04:00
aurel cordonnier
50944fd712 Merge UE5/RES @ 16162155 to UE5/Main
This represents UE4/Main @ 16130047 and Dev-PerfTest @ 16126156

[CL 16163576 by aurel cordonnier in ue5-main branch]
2021-04-29 19:32:06 -04:00
Patrick Laflamme
d6a9f2f2e9 Fixed missing PCallstack happening when the Editor has more than 256 threads and the crashing thread is not in the 256 first visited by the OS.
- Bumped the limit from 256 to 512
  - Always reserve one spot for the crashing thread in the list transmitted to CRC, possibly ignoring some threads.
  - Added diagnostic logs in CRC to capture cases where the number of threads would reach the new limit of 512 or if the crashing thread is 0.

#jira UE-114291 - Fail to capture some Editor PCallstack because a hard limit in GenericCrashContext
#rb Johan.Berg

[CL 16123400 by Patrick Laflamme in ue5-main branch]
2021-04-27 08:35:19 -04:00
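The "reserve one spot for the crashing thread" rule from d6a9f2f2e9 can be illustrated with a short sketch. It is not the GenericCrashContext code: the function name, the use of plain integer thread IDs, and the choice to place the crashing thread first are assumptions. Only the 512 limit and the guarantee that the crashing thread is always included come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Limit named in the commit message (raised from 256).
constexpr std::size_t kMaxReportedThreads = 512;

// Hypothetical helper: returns at most `limit` thread IDs and always includes
// `crashingThread`. The crashing thread takes the first slot (an assumption);
// the remaining slots are filled in enumeration order, so threads enumerated
// late may be dropped when the process has more threads than the limit.
std::vector<std::uint32_t> SelectThreadsForReport(
    const std::vector<std::uint32_t>& enumeratedThreads,
    std::uint32_t crashingThread,
    std::size_t limit = kMaxReportedThreads)
{
    assert(limit >= 1 && "need at least one slot for the crashing thread");

    std::vector<std::uint32_t> selected;
    selected.push_back(crashingThread);  // reserved slot

    for (std::uint32_t id : enumeratedThreads) {
        if (selected.size() == limit) break;                // list is full
        if (id != crashingThread) selected.push_back(id);   // avoid duplicate
    }
    return selected;
}
```

With 600 enumerated threads and the crashing thread enumerated last, the result still contains it, which is the failure mode the commit describes: previously the crashing thread could fall outside the first 256 threads visited by the OS.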
Patrick Laflamme
296f501123 Added a diagnostic log to CRC when the handle returned by OpenProcess() is invalid. This handle is used to stack walk the crash and generate a minidump.
#rb trivial

[CL 16020816 by Patrick Laflamme in ue5-main branch]
2021-04-15 10:01:55 -04:00