Luckily the alpha channel (A) is in the same place and has the same width,
so we can easily do this for everything except framebuffers.
Technically we could do it in OpenGL as well.
Small (1-2%) performance improvement in FF2.
If only parameters change (wrapping, clut, etc.) we don't need to
rehash the data - we know it hasn't changed.
Should narrow the gap between lazy texture hashing on and off.
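
A minimal sketch of the idea, not the actual texture cache code. TexEntry,
HashTexels and UpdateEntry are hypothetical names, and FNV-1a only stands in
for whatever hash is really used. The caller is assumed to know whether the
texel data itself may have changed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical cache entry; field names are illustrative only.
struct TexEntry {
    uint32_t dataHash = 0;  // hash of the texel data last seen
    uint32_t paramKey = 0;  // packed parameters (wrap mode, clut id, ...)
    int rehashCount = 0;    // counts rehashes, for illustration
};

// Stand-in for an expensive hash over texture memory (FNV-1a).
inline uint32_t HashTexels(const uint8_t *data, size_t size) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < size; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

// Returns true if the texel data had to be rehashed.
// dataDirty is an assumed flag: the caller knows the data may have changed.
inline bool UpdateEntry(TexEntry &e, const uint8_t *data, size_t size,
                        uint32_t newParamKey, bool dataDirty) {
    e.paramKey = newParamKey;
    if (!dataDirty) {
        // Only parameters changed: keep the old hash, skip the rehash.
        return false;
    }
    e.dataHash = HashTexels(data, size);
    e.rehashCount++;
    return true;
}
```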
This will become really powerful if we add some code to the vertex decoder
to check for non-full alpha in the vertices, and set gstate_c.vertexFullAlpha
if none is found (we probably want to do the reverse: set it to true up front
and clear it if any non-255 alpha is found).
Alpha testing is a performance killer on many mobile GPUs, so going to some
lengths to avoid it can be worth it.
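
A rough sketch of the "set to true, clear on first non-255 alpha" variant.
CachedState is a hypothetical stand-in for gstate_c, ScanVertexAlpha is an
invented name, and the colors are assumed to be 8888 with alpha in the top
byte. The real vertex decoder handles more formats than this.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the cached GPU state holding the flag.
struct CachedState {
    bool vertexFullAlpha = true;
};

// Assumes 32-bit colors with alpha in bits 24-31.
inline void ScanVertexAlpha(const uint32_t *colors, size_t count,
                            CachedState &state) {
    // Start optimistic, then clear as soon as one vertex disagrees.
    state.vertexFullAlpha = true;
    for (size_t i = 0; i < count; i++) {
        if ((colors[i] >> 24) != 0xFF) {
            state.vertexFullAlpha = false;
            return;  // no need to look at the remaining vertices
        }
    }
}
```

Starting from true means an empty batch leaves the flag set and the loop can
exit early on the first translucent vertex.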
It's actually already pretty decent (unlike the softgpu), but there were a
few places it could use a bit of help. Speeds up things with hardware
transform off, or areas that need to use software transform.
Seems like any stride from 4 up is supported in 8888; most likely it just
needs to align to 16 bytes. Values above 1024 work, but e.g. 2044 seems buggy.
Fixes the map on Hexyz Force (rendered at 80 stride).
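
The 16-byte guess can be written down as a small check. This is only the
hypothesis from the text, assuming 4 bytes per pixel for 8888 and a stride
measured in pixels; StrideLooksValid8888 is a made-up name.

```cpp
#include <cstdint>

// 8888 uses 4 bytes per pixel.
constexpr uint32_t kBytesPerPixel8888 = 4;

// Hypothesis only: a stride (in pixels) is accepted if it is at least 4
// and the row size in bytes is a multiple of 16.
constexpr bool StrideLooksValid8888(uint32_t strideInPixels) {
    return strideInPixels >= 4 &&
           (strideInPixels * kBytesPerPixel8888) % 16 == 0;
}
```

A stride of 80 gives 320 bytes per row and passes. So does 2044 (8176 bytes),
even though it seems buggy in practice, so alignment alone does not explain
everything.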