I kept all the existing PL_DHashTableAdd() calls fallible, in order to be
conservative, except for the ones in nsAtomTable.cpp which already were
followed immediately by an abort on failure.
As explained in bug 1111355, having avx enabled appears to change the
alignment behavior of alloca (apparently adding an extra 16 bytes) of
padding/alignment (and using 32-byte alignment instead of 16-byte). The
suggestion of using __bultin_alloca_with_align in bug 1111355 didn't fix
the problem, so this seems to be the best available workaround, given
that this code, which should perhaps better be written in assembly, is
written in C++.
Interestingly, this is NOT fixed by #pragma GCC target ("arch=x86-64").
(I determined the (undocumented) name for the default -march value on
x86_64 from the gcc source code (gcc/config/i386/i386.c, function
ix86_option_override_internal, code that sets opts->x_ix86_arch_string .)
I confirmed that this sets the same macros based on the empty diff
between the output of 'gcc -E -dM -x c++ /dev/null' and 'gcc -E -dM -x
c++ -march=x86-64 /dev/null', which was not an empty diff for other
-march values (e.g., k8).)
I confirmed that the push_options and pop_options actually work by
putting the push/pop pair around a different (earlier) function, and
testing that this did not fix the bug (with the pop_options before
NS_InvokeByIndex).
See the gcc documentation at:
https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Function-Specific-Option-Pragmas.htmlhttps://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Function-Attributes.htmlhttps://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/i386-and-x86-64-Options.html
I kept all the existing PL_DHashTableAdd() calls fallible, in order to be
conservative, except for the ones in nsAtomTable.cpp which already were
followed immediately by an abort on failure.
Because they are now just equivalent to |new PLDHashTable()| +
PL_DHashTableInit() and PL_DHashTableFinish(t) + |delete t|, respectively.
They're only used in a handful of places and obscure things more than they
clarify -- I only recently worked out exactly how they different from Init()
and Finish().
Because it's no longer needed now that entry storage isn't allocated there.
(The other possible causes of failures are much less interesting and simply
crashing is a reasonable thing to do for them.)
This also makes PL_DNewHashTable() infallible.
This makes zero-element hash tables, which are common, smaller, and also avoids
unnecessary malloc/free pairs.
I did some measurements during some basic browsing of a few sites. I found that
35% of all live tables were empty with a few tabs open. And cumulatively, for
the whole session, 45% of tables never had an element added to them.
There is more to be done w.r.t. simplifying initialization, which will occur in
the next patch.
PL_DHashTableLookup() had the same return protocol as SearchTable(), which
meant that compilers could generate a tail call from the former to the latter.
Bug 1124973 replaced PL_DHashTableLookup() with PL_DHashTableSearch(), which
has a slightly different return protocol, and so the tail call was lost. This
appears to be the cause of the Fennec performance regression.
This patch splits SearchTable() in two (using templates to avoid explicit code
duplication). SearchTable<false>() now has the same return protocol as
PL_DHashTableSearch(), and so the tail call can now be used again.