The former block size traded away good fit within cache lines in
order to gain faster division in deque_item(). However, compilers
are getting smarter and can now replace the slow division operation
with a fast integer multiply and right shift. Accordingly, it makes
sense to go back to a size that lets blocks neatly fill entire
cache-lines.
GCC-4.8 and CLANG 4.0 both compute "x // 62" with something
roughly equivalent to "x * 9520900167075897609 >> 69".
The current pattern of memory access will update both the leftlink and
rightlink at the same time, so they should be positioned side-by-side
for better cache locality.
Keeping the leftlink at the front of the structure would make sense
only if the paired updates were eliminated by backporting changesets
49a9c734304d, 3555cc0ca35b, ae9ee46bd471, and 744dd749e25b. However,
that isn't likely to happen, so we're better off with the leftlink at
the end of the structure.