
43 Commits

Author SHA1 Message Date
Patrick Laflamme
3aa8c48447 Fixed DisasterRecovery asserting if the recovery session database could not be opened or was corrupted.
- Changed the recovery session activity stream to return 'I'm done' when it encounters an error, preventing the caller from ignoring the error and continuing to request new activities (which triggers the assert).
  - Handled the error reported by the stream when searching for a recoverable session just after the Editor finished its initialization.
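
The 'I'm done' behaviour described above can be sketched as follows. This is a minimal standalone illustration, not the Concert implementation: `FReadResult`, `FActivityStreamSketch` and `DrainStream` are hypothetical names, and the activities are plain integers. The point is only that a failed read marks the stream as done, so a caller that loops until done cannot keep requesting activities after an error.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of one read from an activity stream (not a Concert type).
struct FReadResult
{
    std::vector<int> Activities;      // activities fetched by this read
    bool bDone = false;               // true when the caller must stop asking
    std::optional<std::string> Error; // set when the read failed
};

// Hypothetical stream over a session database that may have failed to open.
class FActivityStreamSketch
{
public:
    FActivityStreamSketch(bool bInDatabaseOpened, std::vector<int> InActivities)
        : bDatabaseOpened(bInDatabaseOpened), Pending(std::move(InActivities)) {}

    // Reads up to MaxCount activities, most recent (last stored) first.
    FReadResult Read(std::size_t MaxCount)
    {
        FReadResult Result;
        if (!bDatabaseOpened)
        {
            // On error, report it AND flag the stream as done so the caller stops.
            Result.bDone = true;
            Result.Error = "session database could not be opened";
            return Result;
        }
        while (!Pending.empty() && Result.Activities.size() < MaxCount)
        {
            Result.Activities.push_back(Pending.back());
            Pending.pop_back();
        }
        Result.bDone = Pending.empty();
        return Result;
    }

private:
    bool bDatabaseOpened;
    std::vector<int> Pending;
};

// Reads the stream until it reports done. Returns false and fills OutError on failure.
inline bool DrainStream(FActivityStreamSketch& Stream, std::vector<int>& OutActivities, std::string& OutError)
{
    for (;;)
    {
        FReadResult Result = Stream.Read(2);
        OutActivities.insert(OutActivities.end(), Result.Activities.begin(), Result.Activities.end());
        if (Result.Error)
        {
            OutError = *Result.Error;
            return false;
        }
        if (Result.bDone)
        {
            return true;
        }
    }
}
```

With this shape, the caller that searches for a recoverable session after Editor initialization only needs to check the return value of `DrainStream` to handle the error.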

Fixed DisasterRecovery freezing the Editor if the new session database could not be created successfully.
  - Reported the error to the client instead of silently ignoring it.
  - Updated the verbose log to Warning to let the user know that a new session could not be created (and therefore disaster recovery is going to be off).

#jira UE-89128 - Re-Opening Game After Trying to Add a Cooked Material to a Level Results in a Crash
#rb Jamie.Dale

Details:
  - The bug UE-89128 is possibly related to UE-89826 (at least on Mac), where the database file would fail to open successfully. A corrupted database or a broken file API would fail to open the database file and result in the assert below:
        Assertion failed: (RequestFetchCount < 0 && LowestFetchedActivityId == 0) || (RequestFetchCount > 0 && LowestFetchedActivityId > 1) [File:/Users/build/Build/++UE4+Licensee/Sync/Engine/Plugins/Developer/Concert/ConcertSync/ConcertSyncClient/Source/ConcertSyncClient/Private/ConcertActivityStream.cpp] [Line: 66]

This CL fix can be tested by:
      1. Launch the Editor, generate a crash with 'debug crash' command.
      2. Comment 'SessionDatabase->Open(Session->GetSessionWorkingDirectory());' in Engine\Plugins\Developer\Concert\ConcertSync\ConcertSyncServer\Source\ConcertSyncServer\Private\ConcertSyncServerLiveSession.cpp to simulate SQLite failing to open the database.
      3. Recompile CrashReportClientEditor(Windows) or UnrealRecoverySvc(Mac/Linux)
      4. Relaunch the editor.

[CL 12393668 by Patrick Laflamme in 4.25 branch]
2020-03-24 17:03:17 -04:00
Francis Hurteau
b39d0af6e3 Expose a Server Port settings for the Multi-User server unicast endpoint over the passed UDP messaging settings when booting the server from the Editor
#jira UE-91000
#rb Jamie.Dale

[CL 12381903 by Francis Hurteau in 4.25 branch]
2020-03-23 15:15:04 -04:00
Patrick Laflamme
d9afcc0448 Improved scalability of Disaster Recovery
- Converted the Concert API for transferring package data from an in-memory-only model to a streaming model to support packages bigger than 2 GB (TNumericLimits<int32>::Max()).
  - Added the IConcertFileSharing interface to share large files between the client and the server. This is used as a side channel to the Concert request/response and event protocol.
  - Fixed the ConcertClientPackageManager to prevent sending the package data on each 'pre-save' when 'live sync' is off. It only emits it once.
  - Fixed UI to correctly report pre-save vs save vs auto-save for package as well as when a package is discarded.
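
To illustrate why a streaming model removes the 2 GB ceiling, here is a minimal sketch in standard C++. It is not the Concert or IConcertFileSharing API; `FitsInInt32` and `StreamCopy` are hypothetical helpers. An in-memory buffer indexed by a signed 32-bit integer cannot describe more than 2147483647 bytes, whereas a chunked copy only ever holds one small buffer and counts the total in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Returns true when a payload size can be represented by a signed 32-bit integer.
inline bool FitsInInt32(std::uint64_t SizeInBytes)
{
    return SizeInBytes <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Copies everything from Source to Destination in chunks of at most ChunkSize bytes.
// Returns the total number of bytes copied as a 64-bit count.
inline std::uint64_t StreamCopy(std::istream& Source, std::ostream& Destination, std::size_t ChunkSize)
{
    assert(ChunkSize > 0);
    std::vector<char> Buffer(ChunkSize);
    std::uint64_t Total = 0;
    while (Source)
    {
        Source.read(Buffer.data(), static_cast<std::streamsize>(Buffer.size()));
        const std::streamsize BytesRead = Source.gcount();
        if (BytesRead <= 0)
        {
            break;
        }
        Destination.write(Buffer.data(), BytesRead);
        Total += static_cast<std::uint64_t>(BytesRead);
    }
    return Total;
}
```

The memory footprint of `StreamCopy` depends on `ChunkSize` only, not on the size of the package being transferred.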

#jira UE-85652 - Crash when importing large FBX with Morph Targets and Disaster Recovery enabled
#jira UE-78722 - Potential Memory Leak with Disaster Recovery Plugin

#rb Francis.Hurteau, Jamie.Dale

[CL 12113821 by Patrick Laflamme in 4.25 branch]
2020-03-10 14:24:53 -04:00
Patrick Laflamme
7f03785783 #jira UE-87927 - Disaster Recovery doesn't restore a crash from a restored session
- Added the ability to copy and restore a live session, removing the need to archive it in the first place and making the server exit fast (releasing the session lock very quickly) before showing the crash UI and before the next Editor instance can start.

Details:

This bug could manifest in various ways. An issue causing this bug was fixed in 11252374. This bug can also be observed if the crash reporting process doesn't release its lock on the crashed session quickly. Archiving a session may take several minutes (depending on the session size), and while a session is archiving, its database is locked and cannot be restored until the archiving process completes. When the Editor reboots after a crash, it searches for a session to recover but skips over any session that is mounted/locked, assuming the session is in use by a concurrent Editor process, potentially preventing it from restoring. The optimal way to work around this problem is to skip the archiving step. Instead, the live session is never archived (saving a copy), which allows the recovery service to shut down and release the session lock very quickly, ensuring that the session will be unlocked when the Editor restarts. On Editor start, if a crashed session is found and the user decides to restore it, the live session is copied into a new live session.
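
The lookup described above, where locked sessions are skipped, can be sketched as follows. The names `FSessionCandidate` and `FindRecoverableSession` are hypothetical and the real Concert code differs; the sketch only shows why a crashed session whose lock is still held (for example while it is being archived) is invisible to the restarting Editor.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical description of a session found on disk (not a Concert type).
struct FSessionCandidate
{
    std::string Name;
    bool bCrashed = false; // the owning Editor did not exit cleanly
    bool bLocked = false;  // another process still holds the session database lock
};

// Returns the name of the first crashed session that is not locked, if any.
inline std::optional<std::string> FindRecoverableSession(const std::vector<FSessionCandidate>& Sessions)
{
    for (const FSessionCandidate& Session : Sessions)
    {
        if (Session.bLocked)
        {
            continue; // assumed to be in use by a concurrent Editor process
        }
        if (Session.bCrashed)
        {
            return Session.Name;
        }
    }
    return std::nullopt;
}
```

If the recovery service still holds the lock when this search runs, the function returns nothing and the crash is never offered for recovery, which is why releasing the lock before the next Editor starts matters.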

This changelist also affects these other Jira issues in the following ways:

#jira UE-87899 - Disaster recovery prevents showing the crash reporting UI in a timely manner if the session is large
  - This CL changes the execution order to shut down the recovery service ASAP to release the lock, and the optimization above makes the shutdown very fast, so the UI should always be shown in a timely manner.

#jira UE-87927 - Disaster Recovery doesn't restore a crash from a restored session
  - This CL ensures the recovery service releases the session lock faster than the next instance of the Editor can start.

#jira UE-87900 - Disaster Recovery stops recording transactions if the UDP transport layer restarts or auto-repair
#jira UE-88517 - Concert Log Spam - (ConcertKeepAlive) discarded
  - This CL fixes an issue with the endpoint timeout logic.

#jira UE-81049 - Clean up the DisasterRecovery Intermediate directory
  - This CL adds code to clean up the intermediate directory left over by a crashed client.

#rb Francis.Hurteau

[CL 11632069 by Patrick Laflamme in 4.25 branch]
2020-02-26 11:17:23 -05:00
Francis Hurteau
5e20adb170 Modified Multi-User session connection text
#rb trivial
#jira UE-89064

[CL 11565339 by Francis Hurteau in 4.25 branch]
2020-02-20 10:16:05 -05:00
Patrick Laflamme
75b36ad299 #jira UE-85967 - UnrealDisasterRecoveryService character length is long and can easily max out path length
#jira UE-88070 - UnrealDisasterRecoveryService paths are too long
  - Renamed UnrealDisasterRecoveryService as UnrealRecoverySvc
  - Set the ConcertSyncServer ShortName to "CncrtSyncSvr" to ensure a shorter build path.

The change saves 29 characters on the offending path. The path before vs the path after:

Engine\Plugins\Developer\Concert\ConcertSync\ConcertSyncServer\Intermediate\Build\Win64\UnrealDisasterRecoveryService\Development\ConcertSyncServer\UnrealDisasterRecoveryService-ConcertSyncServer.lib (Before, 199 chars)
Engine\Plugins\Developer\Concert\ConcertSync\ConcertSyncServer\Intermediate\Build\Win64\UnrealRecoverySvc\Development\CncrtSyncSvr\UnrealRecoverySvc-ConcertSyncServer.lib (After, 170 chars)
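
The character counts above can be checked with a short standalone program fragment. The two paths are copied verbatim from this message; `PathBefore`, `PathAfter` and `CharactersSaved` are names introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Raw string literals keep the backslashes of the Windows paths as-is.
const std::string PathBefore =
    R"(Engine\Plugins\Developer\Concert\ConcertSync\ConcertSyncServer\Intermediate\Build\Win64\UnrealDisasterRecoveryService\Development\ConcertSyncServer\UnrealDisasterRecoveryService-ConcertSyncServer.lib)";

const std::string PathAfter =
    R"(Engine\Plugins\Developer\Concert\ConcertSync\ConcertSyncServer\Intermediate\Build\Win64\UnrealRecoverySvc\Development\CncrtSyncSvr\UnrealRecoverySvc-ConcertSyncServer.lib)";

// Number of characters removed from the offending path by the rename.
inline std::size_t CharactersSaved()
{
    return PathBefore.size() - PathAfter.size();
}
```

The saving breaks down as 12 characters for each of the two occurrences of the target name (UnrealDisasterRecoveryService is 29 characters, UnrealRecoverySvc is 17) plus 5 characters for the module short name (ConcertSyncServer is 17 characters, CncrtSyncSvr is 12).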

#rb Jamie.Dale

Edigrated 11281991 from Dev-VirtualProduction

[CL 11516806 by Patrick Laflamme in 4.25 branch]
2020-02-18 16:17:00 -05:00
Patrick Laflamme
276bd1b2b3 #jira UE-87927 - Disaster Recovery doesn't restore a crash from a restored session
- Fixed the Concert server to restore a session (by creating a new one) in the 'default repository' as expected rather than creating it in the repository containing the archived session.

#rb Jamie.Dale

Edigrated 11252374 from Dev-VirtualProduction.

[CL 11515813 by Patrick Laflamme in 4.25 branch]
2020-02-18 15:52:56 -05:00
Patrick Laflamme
038532623e #jira UE-88517 - Concert Log Spam - (ConcertKeepAlive) discarded.
- Defaulted the Concert debug log level to 'Error' to avoid spamming the log when a remote endpoint disconnects.

#rb Jamie.Dale

Edigrated from 11445411.

[CL 11510926 by Patrick Laflamme in 4.25 branch]
2020-02-18 14:46:14 -05:00
Josh Adams
aa9705149b Copying Private-LoadTimes-4.24 stream to Main. Biggest changes are in Materials/Shader memory freezing.
#rb none

[CL 11282608 by Josh Adams in Main branch]
2020-02-06 13:13:41 -05:00
Patrick Boutot
b67ff68e04 Copying //UE4/Dev-VirtualProduction to //UE4/Dev-Tools-Staging @ 11168401
#rb none
#rnx

[CL 11170710 by Patrick Boutot in Dev-Tools-Staging branch]
2020-01-29 18:45:15 -05:00
Francis Hurteau
9087a7b668 Added a setting for different connection validation modes, allowing a prompt on connection instead of a hard failure
Source Control validation now checks whether checked-out files are actually modified instead of just being checked out; if the validation mode is soft, Multi-User connections are allowed to proceed
Prompt on connection when dirty packages are found under soft validation.
Hot reload dirty packages on connection

#rb Jamie.Dale, Patrick.Laflamme
#jira UE-83300, UE-83302, UE-83303

[CL 11052914 by Francis Hurteau in Dev-VirtualProduction branch]
2020-01-17 14:05:35 -05:00
Patrick Boutot
410c720ac7 Merging //UE4/Dev-Main @ 10886849 to Dev-Tools-Staging (//UE4/Dev-Tools-Staging)
#rb none
#rnx
#author jeanmichel.dignard

[CL 10992634 by Patrick Boutot in Dev-VirtualProduction branch]
2020-01-15 09:39:21 -05:00
Patrick Boutot
e488747140 Copying //UE4/Dev-Core [at] 10708550 to Dev-Main (//UE4/Dev-Main)
#rb none

#ROBOMERGE-OWNER: patrick.boutot
#ROBOMERGE-AUTHOR: robert.manuszewski
#ROBOMERGE-SOURCE: CL 10708666 in //UE4/Main/... via CL 10898071
#ROBOMERGE-BOT: TOOLS (Dev-Tools-Staging -> Dev-VirtualProduction) (v632-10940481)

[CL 10942172 by Patrick Boutot in Dev-VirtualProduction branch]
2020-01-10 11:54:32 -05:00
JeanMichel Dignard
70d074639f Merging //UE4/Dev-Main @ 10886849 to Dev-Tools-Staging (//UE4/Dev-Tools-Staging)
#rb none
#rnx

[CL 10906274 by JeanMichel Dignard in Dev-Tools-Staging branch]
2020-01-08 13:26:18 -05:00
jeanmichel dignard
2ce7666d2d Copying //UE4/Dev-Core [at] 10708550 to Dev-Main (//UE4/Dev-Main)
#rb none

#ROBOMERGE-OWNER: jeanmichel.dignard
#ROBOMERGE-AUTHOR: robert.manuszewski
#ROBOMERGE-SOURCE: CL 10708666 in //UE4/Main/...
#ROBOMERGE-BOT: TOOLS (Main -> Dev-Tools-Staging) (v626-10872990)

[CL 10898071 by jeanmichel dignard in Dev-Tools-Staging branch]
2020-01-07 15:54:23 -05:00
Patrick Laflamme
0afeb9569d Fixed Concert server using a dangling reference when unmounting a repository.
Fixed Concert server to unregister left over request handler on shutdown.
Fixed Concert server handler to report a repository as mounted if the requesting client already mounted it in a previous request.
Updated concert server to clean up its repository database when a repository doesn't exist anymore on disk.

#rb Francis.Hurteau

[CL 10882584 by Patrick Laflamme in Dev-VirtualProduction branch]
2020-01-06 14:35:15 -05:00
Patrick Laflamme
57b56d8844 Fixed the Concert client not canceling in-flight connection tasks as expected if IConcertClient::DisconnectSession() was called before the session was created, for example when CreateSession() and DisconnectSession() were called within a very short interval.
#rb Francis.Hurteau

[CL 10878258 by Patrick Laflamme in Dev-VirtualProduction branch]
2020-01-06 09:25:19 -05:00
Ryan Durand
28d3d740dd (Integrating from Dev-EngineMerge to Main)
Second batch of remaining Engine copyright updates.

#rnx
#rb none
#jira none

[CL 10871196 by Ryan Durand in Main branch]
2019-12-27 07:44:07 -05:00
Robert Manuszewski
7b6f840f7f Copying //UE4/Dev-Core @ 10708550 to Dev-Main (//UE4/Dev-Main)
#rb none

[CL 10708666 by Robert Manuszewski in Main branch]
2019-12-13 11:07:03 -05:00
Patrick Laflamme
b6c6bd4be0 #jira UE-82767 - Multi-User log spam in editor
- Fixed verbosity of Multi-User/Disaster Recovery endpoint discovery by adding a new log category "LogConcertDebug" defaulting to "Warning", except for the Multi-User server, which defaults to the "Log" level. The category verbosity can be adjusted from the command line as: -LogCmds="LogConcertDebug Verbose" or from a console command as: log LogConcertDebug Verbose.
  - Prevented Disaster Recovery client from discovering (and logging) all recovery services running.

#rb Francis.Hurteau

[CL 10467794 by Patrick Laflamme in Dev-VirtualProduction branch]
2019-11-27 09:00:18 -05:00
patrick laflamme
c9ad27951c Fixed missing include file.
#jira UE-83339
#rb Trivial.
#rnx

#ROBOMERGE-SOURCE: CL 10266187 in //UE4/Release-4.24/...
#ROBOMERGE-BOT: RELEASE (Release-4.24 -> Main) (v591-10236483)

[CL 10266201 by patrick laflamme in Main branch]
2019-11-15 15:15:46 -05:00
patrick laflamme
93ab27052b #jira UE-83339 - Disaster Recovery can fail to recover its session when the project is opened from the Project Browser
- Fixed a disaster recovery bug preventing the Editor from recovering a session because another instance of the Editor on another project already locked all the sessions.

Problem:

On Windows, the CrashReportClientEditor (hosting the disaster recovery service) is started during static initialization, before the engine is initialized, which does not allow a lot of command line configuration. The Editor project browser would start a first CrashReportClientEditor instance, which would load and lock all the available sessions (unless another CrashReportClientEditor was running). When the user selected a project, a new Editor and CrashReportClientEditor were launched before the first one was closed. The second instance could not access the existing sessions because they were still locked by the first instance.

Solution:

Because CrashReportClientEditor is launched before the engine is initialized, we don't have any context at launch time. The best option was to delay the moment when the server reloads the existing sessions and enable each client to store its sessions in a different folder (repository) mounted on demand by the server.

Implementation details:
  - Implemented new RPC API to allow the client to list/create/load/drop specific repositories containing its own sessions on demand.
  - Updated the Concert server to manage multiple directories where sessions can be stored/found (session repositories) rather than just one.
  - Added a setting to allow the user to specify where the disaster recovery sessions should be stored on disk. It now defaults to the current project folder.
  - Added a setting to prevent the Concert server from scanning the sessions in the default location.
  - Updated disaster recovery to start without any session repository and let the client decide if a new one needs to be created or an existing one be mounted to restore a previous session.
  - Changed the code to let the disaster recovery client manage its session history rather than letting the server rotate the old sessions. Defaulted the history to 0, since the user has no flow to visualize and pick from the history.

#rb Jamie.Dale

#ROBOMERGE-SOURCE: CL 10260823 in //UE4/Release-4.24/...
#ROBOMERGE-BOT: RELEASE (Release-4.24 -> Main) (v591-10236483)

[CL 10260830 by patrick laflamme in Main branch]
2019-11-15 12:55:57 -05:00
patrick laflamme
8058f82440 #jira UE-81549 - Disaster Recovery fails to find the session after selecting Send and Close or Send and Restart in Crash Reporter
- Cleared the concert server instance info on server shutdown.
  - Shut down the disaster recovery service when a crash is created. This enables the next server instance to grab the file lock and restore.
  - Fixed archive rotation (delete oldest) that did not work when concurrent servers existed.
  - Improved disaster recovery error messages.
  - Fixed disaster recovery client not restoring a session that was crashed (server managed the crash), but for which the client process was still hanging around.
  - Prevented showing the recovery UI if -unattended is specified on the command line.

#rb Jamie.Dale

#ROBOMERGE-SOURCE: CL 9617188 in //UE4/Release-4.24/...
#ROBOMERGE-BOT: RELEASE (Release-4.24 -> Main) (v528-9595928)

[CL 9617199 by patrick laflamme in Main branch]
2019-10-16 10:42:04 -04:00
Patrick Laflamme
6805fb63aa #jira UE-81549 - Disaster Recovery fails to find the session after selecting Send and Close or Send and Restart in Crash Reporter
- Cleared the concert server instance info on server shutdown.
  - Shut down the disaster recovery service when a crash is created. This enables the next server instance to grab the file lock and restore.
  - Fixed archive rotation (delete oldest) that did not work when concurrent servers existed.
  - Improved disaster recovery error messages.
  - Fixed disaster recovery client not restoring a session that was crashed (server managed the crash), but for which the client process was still hanging around.
  - Prevented showing the recovery UI if -unattended is specified on the command line.

#rb Jamie.Dale

[CL 9617188 by Patrick Laflamme in 4.24 branch]
2019-10-16 10:41:37 -04:00
Johan Torp
260681e90e CustomVersion thread-safety follow-up fix
#rb paul.chipchase
#jira UE-80965

[CL 9614528 by Johan Torp in Main branch]
2019-10-16 04:16:03 -04:00