#jira UE-80393
[FYI] jason.stasik
#ROBOMERGE-SOURCE: CL 8740502 via CL 8740567 via CL 8748495
#ROBOMERGE-BOT: (v422-8689730)
[CL 8748626 by patrick enfedaque in Main branch]
#jira UE-80393
#rb jason.stasik
[FYI] tim.gautier
#rnx
#ROBOMERGE-SOURCE: CL 8696823 via CL 8697751 via CL 8708001
#ROBOMERGE-BOT: (v422-8689730)
[CL 8709031 by patrick enfedaque in Main branch]
#rb none
#rnx
#ROBOMERGE-SOURCE: CL 8695819 via CL 8695827 via CL 8703359
#ROBOMERGE-BOT: (v422-8689730)
[CL 8703514 by patrick enfedaque in Main branch]
#jira UE-80400
#rb patrick.enfedaque
#ROBOMERGE-SOURCE: CL 8695613 via CL 8703310
#ROBOMERGE-BOT: (v422-8689730)
[CL 8703467 by sebastien lussier in Main branch]
#jira UE-80347
#rb richard.malo
#ROBOMERGE-SOURCE: CL 8690557 via CL 8700980
#ROBOMERGE-BOT: (v422-8689730)
[CL 8701194 by patrick enfedaque in Main branch]
#jira UE-75006
Fixed a few combinations of HLOD cluster generation options
#ROBOMERGE-SOURCE: CL 8690507 via CL 8700951
#ROBOMERGE-BOT: (v422-8689730)
[CL 8701170 by sebastien lussier in Main branch]
#rb paul.chipchase
#jira UE-80025
#lockdown cristina.riveron
#ROBOMERGE-SOURCE: CL 8693005 in //UE4/Release-4.23/...
#ROBOMERGE-BOT: RELEASE (Release-4.23 -> Main) (v422-8689730)
[CL 8693043 by sebastian nordgren in Main branch]
#rb none
#rnx
#ROBOMERGE-OWNER: thomas.sarkanen
#ROBOMERGE-AUTHOR: braeden.shosa
#ROBOMERGE-SOURCE: CL 8681827 via CL 8688664
#ROBOMERGE-BOT: (v422-8689730)
[CL 8692986 by thomas sarkanen in Main branch]
Most UE4 platforms use a 2-byte TCHAR, but some still use a 4-byte TCHAR. Platforms with a 4-byte TCHAR expect their string data to be UTF-32, yet parts of UE4 serialize FString data as a series of UCS2CHAR, simply narrowing or widening each TCHAR in turn. This can produce invalid or corrupted UTF-32 strings (either UTF-32 strings containing UTF-16 surrogates, or UTF-32 code points truncated to 2 bytes), which leads to odd behavior or crashes.
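To make the failure mode concrete, here is a minimal standalone sketch in standard C++ (not engine code). NarrowEachUnit, WidenEachUnit and IsValidUtf32 are hypothetical helpers invented for illustration; they mimic the per-element narrowing/widening described above rather than any actual UE4 function.

```cpp
#include <string>

// Hypothetical helper: truncate each 4-byte unit to 2 bytes, one element at a time.
std::u16string NarrowEachUnit(const std::u32string& In)
{
    std::u16string Out;
    for (char32_t Unit : In)
    {
        Out.push_back(static_cast<char16_t>(Unit)); // drops the high bits
    }
    return Out;
}

// Hypothetical helper: widen each 2-byte unit to 4 bytes, one element at a time.
std::u32string WidenEachUnit(const std::u16string& In)
{
    std::u32string Out;
    for (char16_t Unit : In)
    {
        Out.push_back(static_cast<char32_t>(Unit)); // surrogates are copied as-is
    }
    return Out;
}

// A UTF-32 string must not contain surrogates or values above U+10FFFF.
bool IsValidUtf32(const std::u32string& In)
{
    for (char32_t Unit : In)
    {
        const bool bIsSurrogate = (Unit >= 0xD800 && Unit <= 0xDFFF);
        if (bIsSurrogate || Unit > 0x10FFFF)
        {
            return false;
        }
    }
    return true;
}
```

With these helpers, narrowing U+1F600 yields the single unrelated unit 0xF600, and widening the UTF-16 pair 0xD83D 0xDE00 yields two lone surrogates that IsValidUtf32 rejects.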
This change updates the parts of UE4 that process FString data as a series of 2-byte values to do so on the correct UTF-16 interpretation of the data, converting to/from UTF-32 as required on platforms that use a 4-byte TCHAR. This conversion is a no-op on platforms that use a 2-byte TCHAR, as the string is already assumed to be valid UTF-16 data. Note that while FString may contain UTF-16 code units on platforms using a 2-byte TCHAR, this change does nothing to make FString a Unicode string on those platforms (i.e., a string that understands and works on code points); it remains just a bag of code units.
Two new variable-width string converters have been added to facilitate the conversion (modelled after the TCHAR<->UTF-8 converters), TUTF16ToUTF32_Convert and TUTF32ToUTF16_Convert. These are used both for TCHAR<->UTF16CHAR conversion when needed, and for TCHAR<->wchar_t conversion on platforms that use char16_t for TCHAR along with having a 4-byte wchar_t (as defined by the new PLATFORM_WCHAR_IS_4_BYTES option).
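For reference, the variable-width mapping that such converters have to perform can be sketched in standard C++ as follows. This is not the implementation of TUTF16ToUTF32_Convert or TUTF32ToUTF16_Convert; Utf16ToUtf32 and Utf32ToUtf16 are hypothetical names, and replacing malformed input with U+FFFD is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <string>

constexpr char32_t ReplacementChar = 0xFFFD; // assumed policy for malformed input

// Decode UTF-16 code units into UTF-32 code points, combining surrogate pairs.
std::u32string Utf16ToUtf32(const std::u16string& In)
{
    std::u32string Out;
    for (std::size_t Index = 0; Index < In.size(); ++Index)
    {
        const char32_t Unit = In[Index];
        if (Unit >= 0xD800 && Unit <= 0xDBFF) // high surrogate
        {
            const bool bHasLow = Index + 1 < In.size()
                && In[Index + 1] >= 0xDC00 && In[Index + 1] <= 0xDFFF;
            if (bHasLow)
            {
                const char32_t Low = In[++Index];
                Out.push_back(0x10000 + ((Unit - 0xD800) << 10) + (Low - 0xDC00));
            }
            else
            {
                Out.push_back(ReplacementChar); // unpaired high surrogate
            }
        }
        else if (Unit >= 0xDC00 && Unit <= 0xDFFF)
        {
            Out.push_back(ReplacementChar); // unpaired low surrogate
        }
        else
        {
            Out.push_back(Unit); // BMP code point, one unit
        }
    }
    return Out;
}

// Encode UTF-32 code points as UTF-16, splitting supplementary code points.
std::u16string Utf32ToUtf16(const std::u32string& In)
{
    std::u16string Out;
    for (char32_t CodePoint : In)
    {
        const bool bIsSurrogate = (CodePoint >= 0xD800 && CodePoint <= 0xDFFF);
        if (bIsSurrogate || CodePoint > 0x10FFFF)
        {
            Out.push_back(static_cast<char16_t>(ReplacementChar));
        }
        else if (CodePoint < 0x10000)
        {
            Out.push_back(static_cast<char16_t>(CodePoint));
        }
        else
        {
            const char32_t Offset = CodePoint - 0x10000;
            Out.push_back(static_cast<char16_t>(0xD800 + (Offset >> 10)));
            Out.push_back(static_cast<char16_t>(0xDC00 + (Offset & 0x3FF)));
        }
    }
    return Out;
}
```

Because one code point may map to one or two UTF-16 units, the output length cannot be derived from the input length alone, which is why these are variable-width converters like the UTF-8 ones.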
These conversion routines are accessed either via the conversion macros (TCHAR_TO_UTF16, UTF16_TO_TCHAR, TCHAR_TO_WCHAR, and WCHAR_TO_TCHAR), or by using a conversion struct (FTCHARToUTF16, FUTF16ToTCHAR, FTCHARToWChar, and FWCharToTCHAR), which is the same pattern as the existing TCHAR<->UTF-8 conversion. Both the macros and the structs are defined as no-ops when the conversion isn't needed, but always exist so that code can be written in a portable way.
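The "always exists, but is a no-op when not needed" pattern can be illustrated outside the engine with a small standard C++ sketch. TToUtf16Sketch is a hypothetical type invented here, not one of the engine's structs; the Get()/Length() accessors and the U+FFFD replacement for out-of-range values are likewise assumptions of the sketch.

```cpp
#include <cstddef>
#include <string>

template <typename CharT>
class TToUtf16Sketch; // only the two specializations below are defined

// 2-byte source: pass-through, no copy. The source string must outlive this object.
template <>
class TToUtf16Sketch<char16_t>
{
public:
    explicit TToUtf16Sketch(const std::u16string& In)
        : Ptr(In.c_str()), Len(In.size()) {}

    const char16_t* Get() const { return Ptr; }
    std::size_t Length() const { return Len; }

private:
    const char16_t* Ptr;
    std::size_t Len;
};

// 4-byte source: convert once into an owned UTF-16 buffer.
template <>
class TToUtf16Sketch<char32_t>
{
public:
    explicit TToUtf16Sketch(const std::u32string& In)
    {
        for (char32_t CodePoint : In)
        {
            if (CodePoint > 0x10FFFF)
            {
                Buffer.push_back(char16_t(0xFFFD)); // assumed policy for bad input
            }
            else if (CodePoint >= 0x10000)
            {
                const char32_t Offset = CodePoint - 0x10000;
                Buffer.push_back(static_cast<char16_t>(0xD800 + (Offset >> 10)));
                Buffer.push_back(static_cast<char16_t>(0xDC00 + (Offset & 0x3FF)));
            }
            else
            {
                Buffer.push_back(static_cast<char16_t>(CodePoint));
            }
        }
    }

    const char16_t* Get() const { return Buffer.c_str(); }
    std::size_t Length() const { return Buffer.size(); }

private:
    std::u16string Buffer;
};
```

Calling code can then be written once against Get() and Length() regardless of the source character width; only the 4-byte specialization pays for a conversion.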
Very little code actually needed updating to use UTF-16, as the vast majority makes no assumptions about the size of TCHAR or about how FString should be serialized. The main places were the FString archive serialization and the JSON reader/writer, along with some minor fixes to the UTF-8 conversion logic for platforms using a 4-byte TCHAR.
Tests have been added to verify that an FString representing a UTF-32 code point can be losslessly converted to/from UTF-8 and UTF-16, and serialized to/from an archive.
#jira
#rb Steve.Robb, Josh.Adams
#ROBOMERGE-SOURCE: CL 8676728 via CL 8687863
#ROBOMERGE-BOT: (v421-8677696)
[CL 8688048 by jamie dale in Main branch]
#rb chris.gagnon
#ROBOMERGE-SOURCE: CL 8675446 via CL 8675447 via CL 8687243
#ROBOMERGE-BOT: (v421-8677696)
[CL 8687393 by matt kuhlenschmidt in Main branch]
#jira UE-80025
#rb chris.gagnon
#lockdown cristina.riveron
#ROBOMERGE-SOURCE: CL 8686367 in //UE4/Release-4.23/...
#ROBOMERGE-BOT: RELEASE (Release-4.23 -> Main) (v421-8677696)
[CL 8686368 by sebastian nordgren in Main branch]
#jira UE-80333
#rb matt.hoffman
#lockdown cristina.riveron
#ROBOMERGE-SOURCE: CL 8680881 in //UE4/Release-4.23/...
#ROBOMERGE-BOT: RELEASE (Release-4.23 -> Main) (v417-8656536)
[CL 8680903 by max chen in Main branch]
#rb jeanfrancois.dube
#rnx
#ROBOMERGE-SOURCE: CL 8673040 via CL 8673042 via CL 8676065
#ROBOMERGE-BOT: (v417-8656536)
[CL 8676177 by patrick enfedaque in Main branch]