External functions should no longer access FVectorVMContext directly; they should instead use the helper member functions in the NiagaraDataInterfaceFunctionContext class. This should have zero effect unless NIAGARA_EXP_VM is defined. If it is defined, the experimental VM functions are used, which are not included in this CL.
#rb shaun.kime #jira none #okforgithub public
#ROBOMERGE-SOURCE: CL 16908052 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v836-16769935)
[CL 16908059 by shawn mcgrath in ue5-release-engine-test branch]
-changes the function table to be an array of function pointers to allow for the redirection without copies/reallocations
-eliminates 2nd most common allocation in cases with large instance counts in niagara
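A minimal sketch of the idea above, assuming a table that holds pointers to callable objects owned elsewhere. ExternalFunction and FunctionTable are hypothetical names for illustration, not the engine's types:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an external (data interface) function.
using ExternalFunction = std::function<int(int)>;

// The table stores pointers, so redirecting an entry is a pointer swap.
// The callable objects themselves are never copied or reallocated.
struct FunctionTable {
    std::vector<const ExternalFunction*> Entries;

    void Redirect(std::size_t Index, const ExternalFunction* Replacement) {
        assert(Index < Entries.size());
        Entries[Index] = Replacement;
    }

    int Call(std::size_t Index, int Arg) const {
        assert(Index < Entries.size() && Entries[Index] != nullptr);
        return (*Entries[Index])(Arg);
    }
};
```

The functions must outlive the table, since the table does not own them.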
#rb stu.mckenna
#ROBOMERGE-SOURCE: CL 12146347 via CL 12146348
#ROBOMERGE-BOT: (v659-12123632)
[CL 12146350 by rob krajcarski in Main branch]
#RB Rob.Krajcarski, Stu.McKenna
#JIRA UE-84463
#ROBOMERGE-OWNER: arne.schober
#ROBOMERGE-AUTHOR: arne.schober
#ROBOMERGE-SOURCE: CL 12105113 via CL 12121713
#ROBOMERGE-BOT: (v657-12064184)
[CL 12121717 by arne schober in Main branch]
* added functions which can retrieve the IDs of the particles spawned in the last tick
* added missing barriers on source buffers when reading from another emitter
* fixed bug when more than one DI function was used in a module (bad alignment on the elements of the array of attribute indices)
Persistent IDs:
* replaced ClearUAV with a custom shader, so we can set sane barriers instead of the stuff that ClearUAV() forces. Dev-Rendering replaced ClearUAV() with a better API, but I didn't want to wait for that merge.
* grouped together all the ID buffer clears before the simulation dispatches, and all the free ID list updates after rendering, so they can overlap and to avoid stalls
Barriers:
* fixed a bogus compute to graphics transition on the ID to index buffer
* when running multiple ticks, only the outputs of the final tick are transitioned to graphics
* automatic CS cache flushes are disabled once before running all the ticks and enabled at the end, instead of doing it for each tick
#jira UE-84717
#rb Stu.McKenna
#ROBOMERGE-OWNER: mihnea.balta
#ROBOMERGE-AUTHOR: mihnea.balta
#ROBOMERGE-SOURCE: CL 11163241 via CL 11163243
#ROBOMERGE-BOT: (v640-11091645)
[CL 11167081 by mihnea balta in Main branch]
-reduces fixed concurrent CPU costs of emitters by about 15%
-emitter parameter storage remains in place for non-system parameters
#rb stu.mckenna, simon.tovey
#ROBOMERGE-OWNER: rob.krajcarski
#ROBOMERGE-AUTHOR: rob.krajcarski
#ROBOMERGE-SOURCE: CL 11082540 via CL 11082550
#ROBOMERGE-BOT: (v637-11041722)
[CL 11082603 by rob krajcarski in Main branch]
#rnx
#rb none
#ROBOMERGE-OWNER: ryan.durand
#ROBOMERGE-AUTHOR: ryan.durand
#ROBOMERGE-SOURCE: CL 10869210 via CL 10869511 via CL 10869900
#ROBOMERGE-BOT: (v613-10869866)
[CL 10870549 by ryan durand in Main branch]
Left-packs the data with shuffles instead of branching on whether a particle is valid, and claims blocks of indices rather than one at a time.
VectorIntShuffle intrinsic has been added locally to the VectorVM and is only included for selected platforms
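A scalar sketch of the left-packing approach, assuming a 4-wide vector and a 4-bit validity mask. The lookup table stands in for the shuffle control that a real SIMD path would feed to an intrinsic; PackEntry, BuildPackTable and CompactBlock are hypothetical names:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 4; // assumed vector width

// One precomputed entry per 4-bit validity mask.
struct PackEntry {
    std::array<int, kLanes> SourceLane; // valid lanes first, in lane order
    int NumValid;
};

// Built once; the per-block path below never tests individual particles.
inline std::array<PackEntry, 16> BuildPackTable() {
    std::array<PackEntry, 16> Table{};
    for (std::uint32_t Mask = 0; Mask < 16; ++Mask) {
        PackEntry& Entry = Table[Mask];
        Entry.NumValid = 0;
        for (int Lane = 0; Lane < kLanes; ++Lane) {
            if ((Mask >> Lane) & 1u) {
                Entry.SourceLane[Entry.NumValid++] = Lane;
            }
        }
    }
    return Table;
}

// Compacts the valid lanes of Src into Dst. The output range is claimed
// with a single atomic add for the whole block, not one add per particle.
inline int CompactBlock(const float* Src, std::uint32_t ValidMask,
                        float* Dst, std::atomic<int>& NextIndex) {
    static const std::array<PackEntry, 16> Table = BuildPackTable();
    assert(ValidMask < 16);
    const PackEntry& Entry = Table[ValidMask];
    const int Base = NextIndex.fetch_add(Entry.NumValid);
    for (int Out = 0; Out < Entry.NumValid; ++Out) {
        Dst[Base + Out] = Src[Entry.SourceLane[Out]];
    }
    return Entry.NumValid;
}
```

A vectorized version would shuffle all four lanes and store them unconditionally; the scalar copy loop here only keeps the sketch portable.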
#rb stu.mckenna, simon.tovey
#ROBOMERGE-SOURCE: CL 10594706 via CL 10594709
#ROBOMERGE-BOT: (v607-10590470)
[CL 10594712 by rob krajcarski in Main branch]
Left-packs the data with shuffles instead of branching on whether a particle is valid, and claims blocks of indices rather than one at a time.
#rb stu.mckenna, simon.tovey
#ROBOMERGE-SOURCE: CL 10593082 via CL 10593089
#ROBOMERGE-BOT: (v607-10590470)
[CL 10593090 by rob krajcarski in Main branch]
[FYI] nicholas.goldstein
#rnx
#ROBOMERGE-SOURCE: CL 10062477 via CL 10062526
#ROBOMERGE-BOT: (v574-10069753)
[CL 10071545 by stu mckenna in Main branch]
- Read ByteCode directly if unaligned loads are supported to condense 3 ops into 1 when reading a uint16
- Pre-calculate instance loops for VECTOR_WIDTH_FLOATS to avoid each vector op having to round / divide instance counts
- Added runtime optimization of the VM script; this currently boils down to a function call per VM invoke + storing the data required
- Use vm.OptimizeVMByteCode to enable optimized code generation
- Use vm.UseOptimizedVMByteCode to enable running optimized code rather than the traditional byte code
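A rough sketch of the first two items, assuming little-endian operand order and a vector width of 4 floats. The function names and kVectorWidthFloats are hypothetical stand-ins, not the VM's actual helpers:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Byte-wise decode: two byte loads, a shift and an OR.
inline std::uint16_t ReadU16Bytewise(const std::uint8_t* ByteCode) {
    return static_cast<std::uint16_t>(ByteCode[0] | (ByteCode[1] << 8));
}

// Direct decode: memcpy is the portable way to express an unaligned load.
// Compilers typically lower it to a single load where the platform allows.
// The result is in host byte order.
inline std::uint16_t ReadU16Direct(const std::uint8_t* ByteCode) {
    std::uint16_t Value;
    std::memcpy(&Value, ByteCode, sizeof(Value));
    return Value;
}

constexpr int kVectorWidthFloats = 4; // stand-in for VECTOR_WIDTH_FLOATS

// Computed once per execution so each vector op can reuse the loop count
// instead of rounding and dividing the instance count itself.
inline int NumVectorLoops(int NumInstances) {
    assert(NumInstances >= 0);
    return (NumInstances + kVectorWidthFloats - 1) / kVectorWidthFloats;
}
```

The two readers agree only on little-endian hosts, which is one reason to keep the byte-wise path as the fallback.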
[FYI] simon.tovey,rob.krajcarski,shaun.kime
#rnx
#ROBOMERGE-SOURCE: CL 9962648 via CL 9964955
#ROBOMERGE-BOT: (v560-9963197)
[CL 9965540 by stu mckenna in Main branch]
- VM is now directly fed a set of pre generated register tables from the datasets.
- Split the monolithic register table in the VM up so there are explicit I/O and temp register tables the script indexes into directly.
- Avoids some recreation of expensive objects in favour of manual reset calls.
- Rewrote the Output kernel to be more explicit. Going via a templated handler in this case didn't get us any code reuse and just obfuscated its workings.
Saves ~10-25us of overhead per VM invocation, which soon adds up.
Saves ~1-2us inside each VM exec itself.
#rb Stu.Mckenna
#ROBOMERGE-SOURCE: CL 9743796 via CL 9743798
#ROBOMERGE-BOT: (v542-9736015)
[CL 9745804 by simon tovey in Main branch]
#rb none
[FYI] shaun.kime, arne.shober
#ROBOMERGE-SOURCE: CL 9031014 via CL 9042170 via CL 9042350
#ROBOMERGE-BOT: (v443-9013191)
[CL 9042481 by simon tovey in Main branch]
- Optimizing temp register layout for better cache usage.
- Moving VM context from TLS to pool.
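A minimal sketch of pooling contexts instead of keeping one per thread, assuming a mutex-guarded free list. VMContext and VMContextPool here are hypothetical and far simpler than the real context:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified VM context.
struct VMContext {
    std::vector<float> TempRegisters;
    // Manual reset: drops the contents but leaves the allocation for reuse.
    void Reset() { TempRegisters.clear(); }
};

class VMContextPool {
public:
    // Reuses a released context when one is available, else creates one.
    std::unique_ptr<VMContext> Acquire() {
        std::lock_guard<std::mutex> Lock(Mutex);
        if (Free.empty()) {
            return std::unique_ptr<VMContext>(new VMContext());
        }
        std::unique_ptr<VMContext> Context = std::move(Free.back());
        Free.pop_back();
        return Context;
    }

    // Resets the context and returns it to the free list.
    void Release(std::unique_ptr<VMContext> Context) {
        assert(Context != nullptr);
        Context->Reset();
        std::lock_guard<std::mutex> Lock(Mutex);
        Free.push_back(std::move(Context));
    }

    std::size_t NumFree() const {
        std::lock_guard<std::mutex> Lock(Mutex);
        return Free.size();
    }

private:
    mutable std::mutex Mutex;
    std::vector<std::unique_ptr<VMContext>> Free;
};
```

With a pool, the number of live contexts follows the number of concurrent executions rather than the number of threads that have ever run the VM.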
#rb Stu.Mckenna, Shaun.Kime
#ROBOMERGE-OWNER: simon.tovey
#ROBOMERGE-AUTHOR: simon.tovey
#ROBOMERGE-SOURCE: CL 8886792 via CL 8886955 via CL 8889429
#ROBOMERGE-BOT: (v427-8887818)
[CL 8890014 by simon tovey in Main branch]
-------------------
New "Object" parameter type for Niagara.
These allow you to pass in Raw UObject pointers for use in data interfaces or renderers.
Current use case is to pass in Skeletal Mesh references from the outside.
This avoids having to create a whole new data interface when passing in external object references.
Other use cases could be passing in:
- material overrides
- Static mesh references to override the mesh a renderer uses.
- Basically any UObject a DI or renderer may use.
These are weakly typed, meaning that they have to be UObjects but beyond that, they can be anything.
So in theory you could pass in a Material to these Skeletal mesh references.
It's on the using code to validate and act appropriately.
This weak typing does allow a great deal of flexibility though.
E.g. you can now pass in a Skeletal Mesh Actor OR a SkeletalMeshComponent to Niagara via the same parameter and the DI will interpret it accordingly.
The ability to pass in direct subcomponent refs is something Cascade could never do.
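As an illustration of "the using code validates and acts appropriately", here is a sketch with a hypothetical class hierarchy. Object, SkeletalMeshComponent, SkeletalMeshActor and Material below are plain C++ stand-ins rather than engine types, and dynamic_cast stands in for whatever type check the real code performs:

```cpp
#include <cassert>

// Hypothetical stand-ins for a weakly typed object parameter and the
// concrete types a data interface might receive through it.
struct Object { virtual ~Object() = default; };
struct SkeletalMeshComponent : Object { int MeshId = 0; };
struct SkeletalMeshActor : Object { SkeletalMeshComponent Component; };
struct Material : Object {};

// The consumer decides how to interpret the object: a component is used
// directly, an actor yields its component, anything else is rejected.
inline const SkeletalMeshComponent* ResolveSkeletalMesh(const Object* Param) {
    if (const auto* Component =
            dynamic_cast<const SkeletalMeshComponent*>(Param)) {
        return Component;
    }
    if (const auto* Actor = dynamic_cast<const SkeletalMeshActor*>(Param)) {
        return &Actor->Component;
    }
    return nullptr; // wrong type (for example a Material) or null
}
```

A null result is the caller's signal to fall back to a default or skip the work.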
Deterministic Randoms for Data Interfaces
-------------------------------------------
To support functional tests for the object parameters, this CL also includes deterministic randoms for DIs.
Random seeds for deterministic randoms are now packaged up into a NiagaraRandomInfo that can be passed to DI functions.
There is a helper class that will handle the parameter and let you generate randoms that are deterministic or not based on the passed NiagaraRandomInfo.
Non-deterministic is indicated using a special case value of 0xFFFFFFFF for Seed3.
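A sketch of how such a helper could branch on the sentinel, assuming three 32-bit seeds. RandomInfo, HashSeeds and RandomFloat are hypothetical names, and the hash is illustrative only, not the one the engine uses:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical mirror of the seed bundle described above.
struct RandomInfo {
    std::uint32_t Seed1;
    std::uint32_t Seed2;
    std::uint32_t Seed3;
};

constexpr std::uint32_t kNonDeterministicSeed3 = 0xFFFFFFFFu;

inline bool IsDeterministic(const RandomInfo& Info) {
    return Info.Seed3 != kNonDeterministicSeed3;
}

// Illustrative integer hash: same seeds in, same bits out.
inline std::uint32_t HashSeeds(const RandomInfo& Info) {
    std::uint32_t H = Info.Seed1 * 0x9E3779B1u;
    H ^= Info.Seed2 + 0x85EBCA6Bu + (H << 6) + (H >> 2);
    H ^= Info.Seed3 + 0xC2B2AE35u + (H << 6) + (H >> 2);
    return H;
}

// Returns a float in [0, 1). Deterministic seeds go through the hash;
// the sentinel falls back to a per-thread, randomly seeded engine.
inline float RandomFloat(const RandomInfo& Info) {
    std::uint32_t Bits;
    if (IsDeterministic(Info)) {
        Bits = HashSeeds(Info);
    } else {
        static thread_local std::mt19937 Engine{std::random_device{}()};
        Bits = static_cast<std::uint32_t>(Engine());
    }
    // Keep the top 24 bits so the value is exactly representable.
    return static_cast<float>(Bits >> 8) * (1.0f / 16777216.0f);
}
```

Callers that want a sequence of deterministic values would vary one of the seeds per call, for example with a particle ID or a call counter.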
Misc
-------------
Some utility additions to int vector class.
Slight refactor of mesh helper for the skel mesh DI.
Fixed issue with incorrect delta time on first tick of skel mesh DI.
Work Needed
------------------------
UI customization for these parameters in the stack and the component override panel.
As these are typeless probably the best thing here is to have the stack UI be an asset picker and the component UI be an actor picker?
Is there such a thing as a combined asset/actor picker? IDK.
Deterministic Randoms need to be utilized in other DIs/Functions.
#rb Frank.Fella, Shaun.Kime
#ROBOMERGE-SOURCE: CL 7288634 via CL 7301760
#ROBOMERGE-BOT: (v370-7290619)
[CL 7302160 by simon tovey in Main branch]
Fix for ExecutionState crash with many emitters.
[FYI] Stu.Mckenna
#rb none
#ROBOMERGE-SOURCE: CL 6676038 via CL 6678350
#ROBOMERGE-BOT: (v363-6677109)
[CL 6678803 by shaun kime in Main branch]
Skeletal Mesh DI interpolation
> Skeletal mesh interface can now optionally interpolate in its GetSkinnedData function.
> Updated SampleSkeletalMeshSurface module.
> New utility op GetSpawnInterpolation(). Could be useful for other things.
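One way to picture the optional interpolation, assuming it blends linearly between the previous and current frame's skinned positions using a 0..1 spawn interpolation factor. Vec3, Lerp and GetSkinnedPosition are hypothetical names for this sketch:

```cpp
#include <cassert>

struct Vec3 {
    float X;
    float Y;
    float Z;
};

// Standard linear interpolation between A (Alpha = 0) and B (Alpha = 1).
inline Vec3 Lerp(const Vec3& A, const Vec3& B, float Alpha) {
    return Vec3{A.X + (B.X - A.X) * Alpha,
                A.Y + (B.Y - A.Y) * Alpha,
                A.Z + (B.Z - A.Z) * Alpha};
}

// With interpolation off, the current frame's position is returned as is.
// With it on, particles spawned part-way through the frame sample a
// position part-way between the previous and current skinned positions.
inline Vec3 GetSkinnedPosition(const Vec3& Previous, const Vec3& Current,
                               float SpawnInterpolation, bool bInterpolate) {
    assert(SpawnInterpolation >= 0.0f && SpawnInterpolation <= 1.0f);
    return bInterpolate ? Lerp(Previous, Current, SpawnInterpolation)
                        : Current;
}
```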
#jira none
#fyi frank.fella, wyeth.johnson
#rb Morten.Vassvik
[CL 4949680 by Bob Tellez in Main branch]