The code used to update the output ID to index table, but that left stale entries in the input buffer set. When particles died, one of the buffers had -1 for the dead IDs, but the other still had the old execution indexes. This could lead to direct reads returning bogus data, instead of reporting that the ID is no longer in use. The solution is to clear the ID table at the beginning of the tick, and let the UpdateID() function fill in the IDs which are still in use.
This problem didn't occur on GPU, where the tables were already being cleared.
#rb none
#ROBOMERGE-SOURCE: CL 11292467 via CL 11292468
#ROBOMERGE-BOT: (v647-11244347)
[CL 11293007 by mihnea balta in Main branch]
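A minimal sketch of the clear-then-refill approach described in the change above, not the engine code. IdTable, Tick, and kInvalidIndex are hypothetical names used for illustration; only UpdateID() is named in the description, and its real signature is not shown there.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for "this ID is not in use".
constexpr int32_t kInvalidIndex = -1;

// Hypothetical stand-in for the ID to index table.
struct IdTable {
    std::vector<int32_t> IdToIndex;

    explicit IdTable(std::size_t MaxIds) : IdToIndex(MaxIds, kInvalidIndex) {}

    // Start of tick: drop every mapping left over from the previous tick.
    void Clear() { std::fill(IdToIndex.begin(), IdToIndex.end(), kInvalidIndex); }

    // Plays the role of UpdateID(): record where a surviving particle lives.
    void UpdateID(int32_t Id, int32_t Index) { IdToIndex[Id] = Index; }

    // Direct read: dead IDs report kInvalidIndex instead of a stale index.
    int32_t Lookup(int32_t Id) const { return IdToIndex[Id]; }
};

// Simulates one tick over the IDs that are still alive.
void Tick(IdTable& Table, const std::vector<int32_t>& LiveIds) {
    Table.Clear();
    for (int32_t Index = 0; Index < static_cast<int32_t>(LiveIds.size()); ++Index) {
        Table.UpdateID(LiveIds[Index], Index);
    }
}
```

Because the table is cleared first, an ID that is missing from the live list ends the tick as kInvalidIndex without any explicit "mark dead" step.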
The UpdateID CPU kernel didn't take into account the instance start offset, so the spawn step mapped new IDs to indices starting at 0. If the emitter already contained particles, this caused the ID to index mapping to be incorrect in the frame when the spawn happened, which caused further problems if another emitter tried to read the particles in that frame.
#rb none
#ROBOMERGE-SOURCE: CL 11290626 via CL 11290627
#ROBOMERGE-BOT: (v647-11244347)
[CL 11290628 by mihnea balta in Main branch]
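The offset fix above can be illustrated with a small sketch. UpdateIdTable and its parameters are hypothetical names, and the real kernel operates on VM registers rather than std::vector; the point is only that the index written for each ID must include the instance start offset.

```cpp
#include <cstdint>
#include <vector>

// Maps each ID in the batch to its execution index.
// InstanceOffset is the index of the first instance processed by this batch;
// for a spawn step it equals the number of particles already in the emitter.
void UpdateIdTable(std::vector<int32_t>& IdToIndex,
                   const std::vector<int32_t>& BatchIds,
                   int32_t InstanceOffset) {
    for (int32_t i = 0; i < static_cast<int32_t>(BatchIds.size()); ++i) {
        // Without "+ InstanceOffset" the spawned IDs would map to 0, 1, ...
        // and collide with the particles that already exist.
        IdToIndex[BatchIds[i]] = InstanceOffset + i;
    }
}
```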
* added functions which can retrieve the IDs of the particles spawned in the last tick
* added missing barriers on source buffers when reading from another emitter
* fixed bug when more than one DI function was used in a module (bad alignment on the elements of the array of attribute indices)
Persistent IDs:
* replaced ClearUAV with a custom shader, so we can set sane barriers instead of the stuff that ClearUAV() forces. Dev-Rendering replaced ClearUAV() with a better API, but I didn't want to wait for that merge.
* grouped together all the ID buffer clears before the simulation dispatches, and all the free ID list updates after rendering, so they can overlap and avoid stalls
Barriers:
* fixed a bogus compute to graphics transition on the ID to index buffer
* when running multiple ticks, only the outputs of the final tick are transitioned to graphics
* automatic CS cache flushes are disabled once before running all the ticks and enabled at the end, instead of doing it for each tick
#jira UE-84717
#rb Stu.McKenna
#ROBOMERGE-OWNER: mihnea.balta
#ROBOMERGE-AUTHOR: mihnea.balta
#ROBOMERGE-SOURCE: CL 11163241 via CL 11163243
#ROBOMERGE-BOT: (v640-11091645)
[CL 11167081 by mihnea balta in Main branch]
-reduces fixed concurrent CPU costs of emitters by about 15%
-emitter parameter storage remains in place for non-system parameters
#rb stu.mckenna, simon.tovey
#ROBOMERGE-OWNER: rob.krajcarski
#ROBOMERGE-AUTHOR: rob.krajcarski
#ROBOMERGE-SOURCE: CL 11082540 via CL 11082550
#ROBOMERGE-BOT: (v637-11041722)
[CL 11082603 by rob krajcarski in Main branch]
- If the compiler chooses to fold both the Optimize & regular VM function calls together we will read garbage
#rb joe.barnes,ben.woodhouse
#rnx
#ROBOMERGE-OWNER: stu.mckenna
#ROBOMERGE-AUTHOR: stu.mckenna
#ROBOMERGE-SOURCE: CL 10907287 via CL 10907300 via CL 10907449 via CL 10907456 via CL 10907462
#ROBOMERGE-BOT: (v626-10872990)
[CL 10907468 by stu mckenna in Main branch]
#rnx
#rb none
#ROBOMERGE-OWNER: ryan.durand
#ROBOMERGE-AUTHOR: ryan.durand
#ROBOMERGE-SOURCE: CL 10869210 via CL 10869511 via CL 10869900
#ROBOMERGE-BOT: (v613-10869866)
[CL 10870549 by ryan durand in Main branch]
- Fall back to synchronous optimizing when free is enabled, so that we can free the source, until I figure out a safe way to do this
[FYI] shaun.kime,simon.tovey,rob.krajcarski
#rnx
#ROBOMERGE-SOURCE: CL 10667538 via CL 10668398 via CL 10668414
#ROBOMERGE-BOT: (v609-10634694)
[CL 10668428 by stu mckenna in Main branch]
[CODEREVIEW] simon.tovey,rob.krajcarski
#rnx
#ROBOMERGE-SOURCE: CL 10621008 via CL 10621900 via CL 10621993
#ROBOMERGE-BOT: (v608-10590470)
[CL 10622045 by stu mckenna in Main branch]
Left packs the data with shuffles instead of branching on whether a particle is valid, and claims blocks of indices rather than one at a time.
VectorIntShuffle intrinsic has been added locally to the VectorVM and is only included for selected platforms
#rb stu.mckenna, simon.tovey
#ROBOMERGE-SOURCE: CL 10594706 via CL 10594709
#ROBOMERGE-BOT: (v607-10590470)
[CL 10594712 by rob krajcarski in Main branch]
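A scalar emulation of the left-pack idea from the change above, under stated assumptions: 4 lanes, a 16-entry table indexed by the validity mask, and an atomic counter for claiming output slots. All names here are hypothetical; the actual change uses the VectorIntShuffle intrinsic, whose signature is not given in the description, so a plain table lookup stands in for the shuffle.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

constexpr int kLanes = 4;

struct ShuffleEntry {
    std::array<int, kLanes> SrcLane; // source lane for each packed output slot
    int Count;                       // number of valid lanes in the mask
};

// Builds one entry per 4-bit validity mask, valid lanes first and in order.
// A SIMD version would store shuffle control vectors here instead.
std::array<ShuffleEntry, 16> BuildShuffleTable() {
    std::array<ShuffleEntry, 16> Table{};
    for (unsigned Mask = 0; Mask < 16; ++Mask) {
        ShuffleEntry& Entry = Table[Mask];
        for (int Lane = 0; Lane < kLanes; ++Lane) {
            if (Mask & (1u << Lane)) {
                Entry.SrcLane[Entry.Count++] = Lane;
            }
        }
    }
    return Table;
}

// Packs the valid lanes of Src into Dst and returns how many were written.
// The per-lane "is this particle valid" branch is replaced by a table lookup.
int LeftPack4(const int32_t* Src, unsigned ValidMask, int32_t* Dst,
              std::atomic<int32_t>& NextFree) {
    static const std::array<ShuffleEntry, 16> Table = BuildShuffleTable();
    const ShuffleEntry& Entry = Table[ValidMask & 0xFu];
    // Claim a block of Count consecutive output indices with one atomic add,
    // rather than one atomic add per particle.
    const int32_t Base = NextFree.fetch_add(Entry.Count);
    for (int i = 0; i < Entry.Count; ++i) {
        Dst[Base + i] = Src[Entry.SrcLane[i]];
    }
    return Entry.Count;
}
```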
Left packs the data with shuffles instead of branching on whether a particle is valid, and claims blocks of indices rather than one at a time.
#rb stu.mckenna, simon.tovey
#ROBOMERGE-SOURCE: CL 10593082 via CL 10593089
#ROBOMERGE-BOT: (v607-10590470)
[CL 10593090 by rob krajcarski in Main branch]
#rb stu.mckenna
#ushell-cherrypick of 10103273 by john.hable
#ushell-cherrypick of 10114181 by john.hable
#ROBOMERGE-SOURCE: CL 10481339 via CL 10481342
#ROBOMERGE-BOT: (v605-10478255)
[CL 10481343 by rob krajcarski in Main branch]
- The warning is raised because the code compares flags against a uint32 value and the compiler assumes it should be an & == v test, which isn't correct in this instance
#rb none
#jira UE-83124
#rnx
#ROBOMERGE-SOURCE: CL 10034568 via CL 10034570
#ROBOMERGE-BOT: (v565-10026848)
[CL 10034571 by stu mckenna in Main branch]
#jira UE-83124
#rb none
#rnx
#ROBOMERGE-SOURCE: CL 10017401 via CL 10017403
#ROBOMERGE-BOT: (v562-10004402)
[CL 10017405 by stu mckenna in Main branch]
- Read ByteCode directly if unaligned loads are supported to condense 3 ops into 1 when reading a uint16
- Pre-calculate instance loops for VECTOR_WIDTH_FLOATS to avoid each vector op having to round / divide instance counts
- Added runtime optimization of the VM script, this currently boils down to a function call per VM invoke + storing the data required
- Use vm.OptimizeVMByteCode to enable optimized code generation
- Use vm.UseOptimizedVMByteCode to enable running optimized code rather than the traditional byte code
[FYI] simon.tovey,rob.krajcarski,shaun.kime
#rnx
#ROBOMERGE-SOURCE: CL 9962648 via CL 9964955
#ROBOMERGE-BOT: (v560-9963197)
[CL 9965540 by stu mckenna in Main branch]
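Two of the items above (the direct uint16 read and the pre-calculated instance loops) can be sketched as follows. This is an illustration only: kVectorWidthFloats is an assumed value standing in for VECTOR_WIDTH_FLOATS, the function names are hypothetical, and the byte-wise version assumes a little-endian bytecode layout.

```cpp
#include <cstdint>
#include <cstring>

// Assumed vector width, for illustration only.
constexpr int32_t kVectorWidthFloats = 4;

// Number of vector-wide loop iterations needed to cover NumInstances, rounded up.
// Computing this once per VM invocation avoids repeating the divide in every op.
int32_t CalcNumLoops(int32_t NumInstances) {
    return (NumInstances + kVectorWidthFloats - 1) / kVectorWidthFloats;
}

// Byte-wise read: two byte loads combined with a shift/or.
uint16_t ReadU16Bytewise(const uint8_t* Ptr) {
    return static_cast<uint16_t>(Ptr[0] | (Ptr[1] << 8));
}

// Direct read: memcpy is the portable spelling of an unaligned load, and
// compilers typically lower it to a single load where the platform allows it.
uint16_t ReadU16Direct(const uint8_t* Ptr) {
    uint16_t Value;
    std::memcpy(&Value, Ptr, sizeof(Value));
    return Value;
}
```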
- VM is now directly fed a set of pre generated register tables from the datasets.
- Split the monolithic register table in the VM up so there are explicit I/O and temp register tables the script indexes into directly.
- Avoids some recreation of expensive objects in favour of manual reset calls.
- Re-wrote Output kernel to be more explicit. Going via templated handler in this case didn't get us any code reuse and just obfuscated its workings.
Saves ~10-25us of overhead per VM invocation which soon adds up.
Saves ~1-2us inside each VM exec itself.
#rb Stu.Mckenna
#ROBOMERGE-SOURCE: CL 9743796 via CL 9743798
#ROBOMERGE-BOT: (v542-9736015)
[CL 9745804 by simon tovey in Main branch]
#rb none
[FYI] shaun.kime, arne.shober
#ROBOMERGE-SOURCE: CL 9031014 via CL 9042170 via CL 9042350
#ROBOMERGE-BOT: (v443-9013191)
[CL 9042481 by simon tovey in Main branch]
- Optimizing temp register layout for better cache usage.
- Moving VM context from TLS to pool.
#rb Stu.Mckenna, Shaun.Kime
#ROBOMERGE-OWNER: simon.tovey
#ROBOMERGE-AUTHOR: simon.tovey
#ROBOMERGE-SOURCE: CL 8886792 via CL 8886955 via CL 8889429
#ROBOMERGE-BOT: (v427-8887818)
[CL 8890014 by simon tovey in Main branch]
- Moves Niagara tick work into tick function in LastDemotable. In a future CL we can push other work into earlier tick groups.
- Can now optionally push system simulation work off the GT.
- Can now optionally push system instance tick work off the GT in batches. Currently of 8.
- Refactor of system simulation code to be far more clear and maintainable.
- Modified spawn flow to facilitate ticking inside main world tick. Now we spawn all our new systems at the end of the frame's actor ticking so we catch any new systems spawned during the frame.
- Various infrastructure changes to support the above.
- Misc:
Changed some checks to check slow.
Fixed a UI crash
Removed VM stat scope that was too granular.
Moved some static functions relating to the world manager actually into the world manager rather than in the NiagaraModule
#rb Shaun.kime
#ROBOMERGE-SOURCE: CL 8210774 via CL 8212467
#ROBOMERGE-BOT: (v401-8057353)
[CL 8212517 by simon tovey in Main branch]
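As a rough illustration of the batching mentioned in the change above, the sketch below splits a list of system instances into ranges of at most 8 that could each be handed to a task. MakeBatches is a hypothetical helper; how the engine actually schedules the batches is not described here.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Splits [0, NumInstances) into half-open [Begin, End) ranges of at most BatchSize.
// The default of 8 matches the batch size quoted in the description.
std::vector<std::pair<int32_t, int32_t>> MakeBatches(int32_t NumInstances,
                                                     int32_t BatchSize = 8) {
    std::vector<std::pair<int32_t, int32_t>> Batches;
    for (int32_t Begin = 0; Begin < NumInstances; Begin += BatchSize) {
        Batches.emplace_back(Begin, std::min(Begin + BatchSize, NumInstances));
    }
    return Batches;
}
```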
Turn on via vm.DetailedVMScriptStats
#rb Shaun.kime
#ROBOMERGE-SOURCE: CL 8008309 via CL 8014925
#ROBOMERGE-BOT: (v396-7974030)
[CL 8015123 by simon tovey in Main branch]