They are kept around for the sake of the standalone glue, which is used
for e.g. webapprt, which doesn't have direct access to jemalloc, and thus
still needs a wrapper to go through the xpcom function list and get to
jemalloc from there.
Sometimes, at least on Linux, DMDFuncs::sSingleton's static initializer
(in libxul) was being called before sDMDBridge's (in libdmd).
Thus sDMDBridge wasn't constructed yet in the path where its
address is taken, passed down through {replace_,}get_bridge to
ReplaceMallocBridge::Get, and its mVersion field is read.
This patch uses dynamic allocation, following what's done for other
globals in the same situation in this file.
Also, naming convention fix: leading "s" is for C++ class statics;
C-style static globals should be "g".
We need to use _impl variants within mozalloc.h when they are defined because
of how mozglue.dll is linked on Windows, where using malloc/free would use
the symbols from the MSVCRT instead of ours.
TlsGetValue has a semantic difference with pthread_getspecific, in that it
can return a non-error NULL value, so it always sets the LastError.
But allocator callers may not be expecting calling e.g. free() to change
the value of the last error, so preserve it.
The new "num" property lets identical blocks be aggregated in the output. This
patch only uses the "num" property for dead blocks, because that's where the
greatest potential benefit lies, but it could be used for live blocks as well.
On one test case (a complex PDF file) running with --mode=cumulative
--sample-below=1 this patch had the following effects.
- Change in running speed was negligible.
- Compressed output file size dropped from 8.8 to 5.0 MB.
- Compressed output file size dropped from 297 to 50 MB.
- dmd.py runtime (without stack fixing) dropped from 30 to 8 seconds.
The infinite loop happens if chunk_alloc_arena needs to be called when a0malloc
is called. It in turn calls chunk_alloc_default, which uses tsd, which calls
a0malloc if it's the first time the tsd is being gotten from the current thread.
tsd only uses a0malloc on platforms where there is no native thread local storage
support, which, for Mozilla, essentially means anything that is not Linux.
But the tsd is only neededto get the dss precedence setting of the given arena.
That setting has no effect when dss is disabled, which it is on Windows and Mac.
Moreover, the default setting for dss precedence is "secondary", which means
jemalloc only tries dss after it failed to get memory with mmap/VirtualAlloc.
Considering the cases where mmap/VirtualAlloc would fail essentially means
there is shortage of address space, sbrk() is not going to have much more
success, so we might as well disable dss support on all platforms, avoiding
the infinite loop problem on Android and B2G as well.