- Added an optional parameter to OpenLibrary that skips looking for shader library chunks (because there's strong thread contention surrounding FMountedPakFileInfo::KnownPakFilesAccessLock)
- Removed reading of the shared-cooked override, since that feature is obsolete
#rb Arciel.Rekman
#rnx
[CL 26815876 by dave belanger in ue5-main branch]
The views made by DumpGPU must be created with lifetime extension disabled, so they release the underlying resources immediately after each dump pass.
#rnx
#jira UE-157708
#rb Guillaume.Abadie, Zach.Bethel
[CL 26769211 by mihnea balta in ue5-main branch]
* FShaderJobCache::ProcessFinishedJob needs to lock CompileQueueSection from FShaderCompilingManager to work from any thread. Originally, CompileQueueSection was locked higher up in the call stack in FShaderCompilingManager::SubmitJobs. CompileQueueSection is propagated to the FShaderJobCache constructor to make it available.
* Locking CompileQueueSection there creates deadlock issues if JobLock from FShaderJobCache is also held, because certain code paths take the two locks in the opposite order. To solve this, the two locks must never be held at the same time. The two places where ProcessFinishedJob is called from job cache logic are both moved outside of JobLock scopes.
* To reduce lock contention, the hash table of active jobs no longer uses JobLock. Game thread related tasks allocate jobs via PrepareJob, adding to the job hash table. An array of allocated jobs is then passed to SubmitJobs, which doesn't itself access the job hash table at all, so sharing the same lock creates a needless dependency. Striping by the high bits of the hash key is used to further reduce contention in cases where multiple threads are generating jobs (not sure if that happens now, but maybe it will in the future).
* A wait-free linked list algorithm is used to insert items into the queue of pending jobs that haven't been assigned to a worker. Atomic linked list operations are supported for both head and tail insertion (the latter is required for the default FIFO job execution mode). Tail insertion requires maintaining a tail pointer, so it can't use the original Core singly linked list class, and the doubly linked list class in Core fundamentally can't support atomic operations. A version of the singly linked list implementation is therefore copied locally into ShaderCompiler.cpp and adapted to our purposes.
* Wait-free queue insertion is safe across multiple producer threads, which means we only need an FReadScopeLock for insertion. Write locks are required for other list operations, but queue insertion is the massively parallel operation we are most concerned about. Queue removal happens in a much smaller number of manager threads that call GetPendingJobs.
* Because a tail pointer is always maintained for the FIFO, insertion doesn't need to traverse the linked list to find the tail. Traversal costs O(N) per insertion, and potentially O(N^2) across N insertions, so avoiding it reduces the time a lock is held.
* SerializeOutput no longer occurs inside JobLock scopes for processing existing output and duplicate jobs, again reducing the time a lock is held.
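
The tail-insertion scheme described above can be sketched in standard C++17. This is an illustration under stated assumptions, not the code in ShaderCompiler.cpp: `FJobNode`, `FPendingJobQueue`, `PushTail`, `PopAll` and `RunQueueDemo` are hypothetical names, and `std::shared_mutex` stands in for the engine's read/write lock. Producers take only a shared lock and claim the tail with one atomic exchange, so there is no retry loop and no list traversal. Removal takes the exclusive lock, which guarantees that every in-flight insertion has finished linking its node.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical job node; a real shader job carries far more state.
struct FJobNode
{
    int Id = 0;
    std::atomic<FJobNode*> Next{nullptr};
};

// Illustrative FIFO of pending jobs with a maintained tail pointer.
class FPendingJobQueue
{
public:
    // Safe for many producers at once: each takes only a shared (read) lock.
    void PushTail(FJobNode* NewJob)
    {
        std::shared_lock<std::shared_mutex> ReadLock(Lock);
        NewJob->Next.store(nullptr, std::memory_order_relaxed);
        // One unconditional exchange claims the tail slot: no retry, no traversal.
        FJobNode* Prev = Tail.exchange(NewJob, std::memory_order_acq_rel);
        if (Prev != nullptr)
            Prev->Next.store(NewJob, std::memory_order_release);
        else
            Head.store(NewJob, std::memory_order_release); // list was empty
    }

    // Removal takes the exclusive (write) lock, so all insertions have finished linking.
    std::vector<FJobNode*> PopAll()
    {
        std::unique_lock<std::shared_mutex> WriteLock(Lock);
        std::vector<FJobNode*> Out;
        for (FJobNode* Job = Head.load(); Job != nullptr; Job = Job->Next.load())
            Out.push_back(Job);
        Head.store(nullptr);
        Tail.store(nullptr);
        return Out;
    }

private:
    std::shared_mutex Lock;
    std::atomic<FJobNode*> Head{nullptr};
    std::atomic<FJobNode*> Tail{nullptr};
};

// Spawns several producers, drains the queue, and returns how many jobs came out.
int RunQueueDemo(int NumProducers, int JobsPerProducer)
{
    std::vector<FJobNode> Jobs(NumProducers * JobsPerProducer);
    for (int i = 0; i < (int)Jobs.size(); ++i)
        Jobs[i].Id = i;

    FPendingJobQueue Queue;
    std::vector<std::thread> Producers;
    for (int p = 0; p < NumProducers; ++p)
        Producers.emplace_back([&, p] {
            for (int i = 0; i < JobsPerProducer; ++i)
                Queue.PushTail(&Jobs[p * JobsPerProducer + i]);
        });
    for (std::thread& T : Producers)
        T.join();

    std::vector<FJobNode*> Drained = Queue.PopAll();
    // Each producer's own jobs must come out in the order that producer inserted them.
    std::vector<int> LastSeen(NumProducers, -1);
    for (FJobNode* Job : Drained)
    {
        const int p = Job->Id / JobsPerProducer;
        assert(Job->Id > LastSeen[p]);
        LastSeen[p] = Job->Id;
    }
    return (int)Drained.size();
}
```

Calling `RunQueueDemo(4, 1000)` pushes from four threads and drains once. The interleaving between producers is decided by the order of the exchanges, while each producer's own jobs stay in FIFO order.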
#jira UE-187335 UE-190642
#rnx
#rb jason.nadro dan.elksnitis arciel.rekman
[CL 26760695 by jason hoerner in ue5-main branch]
- PC previews of SM6 platforms weren't going to DXC correctly.
- Preview platform include paths weren't initialized when compiling with local-only shader compiles.
- A few preview platform DDSPI settings weren't being initialized correctly based on the capabilities of the preview format.
#rb arciel.rekman, dan.elksnitis
[CL 26589147 by christopher waters in ue5-main branch]
[FYI] elizabeth.baumel
Original CL Desc
-----------------------------------------------------------------
Only include GPU crash utils when they are enabled; fix warning about missing header guard
#jira UE-191104
#rb trivial
[CL 26581778 by keaton stewart in ue5-main branch]