Since the full Uint32 value range can't be represented in a MIRType_Int32, this
function should return a MIRType_Double.
Allow MSimdExtractElement(Uint32x4) to return a MIRType_Int32 too. It will work
like the double version followed by an MTruncateToInt32, which bitcasts the
Uint32 value range into the Int32 value range.
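The difference between the two return types can be sketched in scalar C++. The
helper names below are hypothetical and are not part of the patch; they only
illustrate why a double is needed for the full range and what the bitcast into
the Int32 range does to large values.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: a Uint32 lane returned as a double keeps its
// numeric value, because every uint32_t is exactly representable.
double laneAsDouble(uint32_t lane) {
    return static_cast<double>(lane);
}

// Hypothetical helper: a Uint32 lane returned as an Int32 keeps its bit
// pattern, so values >= 2^31 come out negative.
int32_t laneAsInt32(uint32_t lane) {
    int32_t result;
    std::memcpy(&result, &lane, sizeof(result));
    // The bits survive the round trip unchanged.
    assert(static_cast<uint32_t>(result) == lane);
    return result;
}
```

For example, the lane 0xFFFFFFFF is 4294967295.0 as a double and -1 as an
Int32.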
Add a MSimdBinaryComp::AddLegalized function which expands unsigned compares on
target platforms that don't support them directly. The early expansion exposes
the constants to MIR optimizations.
Unsigned comparison is expressed in terms of signed comparison by offsetting
both sides by INT_MIN.
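A scalar sketch of the offsetting trick, assuming 32-bit two's complement
lanes. The function names are hypothetical; the real expansion operates on MIR
SIMD nodes rather than on scalars.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Add INT_MIN (0x80000000) modulo 2^32 and reinterpret the result as signed.
// This maps 0 to INT32_MIN and 0xFFFFFFFF to INT32_MAX, preserving order.
inline int32_t biasByIntMin(uint32_t x) {
    uint32_t biased = x + 0x80000000u;  // unsigned arithmetic wraps around
    int32_t result;
    std::memcpy(&result, &biased, sizeof(result));
    return result;
}

// Unsigned less-than expressed with a signed comparison only.
inline bool unsignedLessThan(uint32_t a, uint32_t b) {
    bool viaSigned = biasByIntMin(a) < biasByIntMin(b);
    assert(viaSigned == (a < b));  // sanity check against the native compare
    return viaSigned;
}
```

When one operand is a constant, its biased value is a constant too, which is
the kind of thing the early expansion lets the MIR optimizations fold.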
The conversion from Uint32x4 to Float32x4 is not available as an SSE
instruction, so we need to expand into a larger instruction sequence lifted
from LLVM. Make this expansion early when generating MIR so that it can be
exposed to LICM and GVN optimizations.
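One well-known way to build this conversion from signed-only conversions is to
split each lane into 16-bit halves. The scalar sketch below illustrates that
idea only; it is an assumption that it resembles the sequence lifted from LLVM,
and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Convert a uint32_t to float using only signed int -> float conversions.
// Both halves fit in 16 bits, so each converts exactly, and hi * 65536 is
// exact as well. The final addition is the only step that rounds.
inline float uint32ToFloatSketch(uint32_t v) {
    int32_t lo = static_cast<int32_t>(v & 0xFFFFu);
    int32_t hi = static_cast<int32_t>(v >> 16);
    assert(lo <= 0xFFFF && hi <= 0xFFFF);
    return static_cast<float>(hi) * 65536.0f + static_cast<float>(lo);
}
```

Because each intermediate step is a plain arithmetic operation, a sequence of
this shape can be hoisted or deduplicated like any other MIR.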
The conversion from Float32x4 to Uint32x4 can throw a RangeError. It is handled
similarly to LFloat32x4ToInt32x4. This expansion depends on the details of the
cvttps2dq instruction that can't be expressed in MIR, so it can't be expanded
early.
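The range condition itself can be sketched in scalar C++. This shows only which
inputs are accepted, not the cvttps2dq-based sequence; the function name is
hypothetical, and a false return stands in for throwing the RangeError.

```cpp
#include <cassert>
#include <cstdint>

// A float converts to Uint32 when its truncated value lies in [0, 2^32 - 1],
// that is, when -1 < x < 2^32. NaN fails both comparisons and is rejected.
inline bool float32ToUint32Sketch(float x, uint32_t* out) {
    assert(out != nullptr);
    if (!(x > -1.0f && x < 4294967296.0f)) {
        return false;  // the real code would throw a RangeError here
    }
    *out = static_cast<uint32_t>(x);  // truncates toward zero
    return true;
}
```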
Add a new InlinableNative::SimdUint32x4 enumerator, and emit the corresponding
JSJitInfo objects in SIMD.cpp.
Start producing template objects for Uint32x4 operations in BaselineIC.cpp.
Add a new SimdSign enum class to SIMD.h which will be used to distinguish
between signed and unsigned integers in the few places where it matters.
Map the SIMD.Uint32x4 type to the existing MIRType_Int32x4 + SimdSign::Unsigned.
Map SIMD.Int32x4 to MIRType_Int32x4 + SimdSign::Signed.
Add a 'SimdSign sign' argument to those inlineSimd...() functions that care.
Some MIR instructions will get similar fields in the following commits.
For now, abort inlining if unsigned vectors are actually encountered. These cases
will be fixed in the following commits.
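The mapping can be pictured as a pair of a shared machine-level type and a
sign. All names ending in Sketch below are hypothetical stand-ins, and the real
SimdSign enum may have more enumerators than the two shown.

```cpp
#include <cassert>

// Stand-ins for the real enums; only the shape of the mapping matters here.
enum class SimdSign { Unsigned, Signed };
enum class SimdTypeSketch { Int32x4, Uint32x4 };
enum class MirTypeSketch { Int32x4 };

struct LoweredTypeSketch {
    MirTypeSketch mirType;
    SimdSign sign;
};

// Both integer SIMD types share one MIR type and differ only in the sign.
inline LoweredTypeSketch lowerSketch(SimdTypeSketch type) {
    switch (type) {
      case SimdTypeSketch::Int32x4:
        return {MirTypeSketch::Int32x4, SimdSign::Signed};
      case SimdTypeSketch::Uint32x4:
        return {MirTypeSketch::Int32x4, SimdSign::Unsigned};
    }
    assert(false && "unreachable: unknown SIMD type");
    return {MirTypeSketch::Int32x4, SimdSign::Signed};
}
```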
Extract the code that generates template objects for SIMD operations, and
rewrite it to use the JSJitInfo nativeOp encoding.
This avoids the native function pointer comparisons, and it makes it simpler to
add new SIMD types and operations.
The extractLane(), anyTrue(), and allTrue() SIMD functions produce scalar
values, and so they don't need a template object. The canInlineSimd() function
was rejecting these functions because of the missing template object.
At the same time, explicitly avoid inlining any SIMD operations if the JIT does
not support SIMD. This was previously controlled by the absence of the template
object.
This saves some code size in a cold function, and it makes it possible to pass
in the SIMD type as a dynamic argument.
Also detemplatize the static CreateSimdType() to save some code size.
Replace all of the Get*TypeRepr() self-hosting functions with a single
GetSimdTypeDescr() which takes one of the JS_SIMDTYPEREPR_* constants as an
argument instead.
Total code size reduction: ~32 KB.
This commit makes the ByFilename census counter create its own owned copies of
script filenames. If we don't do this, and the heap graph we are analyzing is
the live heap, then the ScriptSource (from which we get the filename) could
disappear out from under us. We can't use a ScriptSourceHolder to keep the
ScriptSource alive because we might be analyzing an offline heap snapshot, in
which case there is no ScriptSource at all.
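The ownership idea can be sketched with standard containers. This is an
illustration only: the class name is hypothetical, and the real counter uses
SpiderMonkey's own string and hash table types rather than std::string and
std::map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Counts things by filename. Keys are owned copies, so the counter stays
// valid even if the caller's buffer is freed or overwritten afterwards.
class ByFilenameSketch {
  public:
    void count(const char* filename) {
        assert(filename != nullptr);
        ++counts_[std::string(filename)];  // copies the characters
    }

    size_t get(const std::string& filename) const {
        auto it = counts_.find(filename);
        return it == counts_.end() ? 0 : it->second;
    }

  private:
    std::map<std::string, size_t> counts_;
};
```

Storing the raw `const char*` instead would leave the map keyed on memory that
the counter does not control.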
Frequently, the mutator will modify nearly the same elements of an object
repeatedly. However, because the set of elements isn't exactly the same each
time, the single item buffer in front of MonoTypeBuffer can't de-duplicate
these edges. For example, in one CodeMirror test case, we would add 245
SlotsEdges entries for almost the same 50,000 elements in an object, causing us
to trace these same 50,000 elements 245 times!
This patch makes `js::gc::StoreBuffer::putSlot` check to see if the new range is
overlapping with the last range added, and if so, merge the ranges rather than
adding partially duplicated elements into the store buffer.
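A minimal sketch of the merge check, using a plain vector in place of the real
store buffer. The type and member names are hypothetical, and treating ranges
that merely touch as mergeable is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SlotRangeSketch {
    const void* object;
    uint64_t start;
    uint64_t end;  // exclusive
};

class SlotBufferSketch {
  public:
    void putSlot(const void* object, uint32_t start, uint32_t count) {
        assert(count > 0);
        uint64_t newStart = start;
        uint64_t newEnd = newStart + count;  // 64-bit, so it cannot overflow
        if (!ranges_.empty()) {
            SlotRangeSketch& last = ranges_.back();
            // Same object, and the ranges overlap or touch: widen the last
            // entry instead of appending a partially duplicated one.
            if (last.object == object && newStart <= last.end &&
                last.start <= newEnd) {
                last.start = std::min(last.start, newStart);
                last.end = std::max(last.end, newEnd);
                return;
            }
        }
        ranges_.push_back(SlotRangeSketch{object, newStart, newEnd});
    }

    size_t size() const { return ranges_.size(); }
    const SlotRangeSketch& last() const { return ranges_.back(); }

  private:
    std::vector<SlotRangeSketch> ranges_;
};
```

Only the most recent entry is checked, which keeps the cost of each put
constant while still catching repeated writes to nearly the same elements.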
This gives a 1000-point increase on Octane's pdf.js subsuite locally. The
CodeMirror test case mentioned above goes from ~10 seconds of execution time to
~1.5 seconds, with the maximum minor GC pause dropping from as much as 40
milliseconds to 4 milliseconds.