Fixes numerous issues in the software renderer caused by unrolling
bidirectional sample loops and by the hacks that unrolling required elsewhere:
* Bidirectional looped samples no longer take up to twice as much RAM.
* IT sustain loops no longer require duplicate sample data.
* IT bidirectional sustain loops should work properly now.
* Other formats that want to use sustain loops should no longer require
sample duplication hacks.
* The software mixer should no longer occasionally skip output samples
when the voice position passes the end of the sample.
* Added a regression test to ensure both XM and IT bidirectional sample
loops render the same as their forward loop counterparts.
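
For illustration only, a minimal sketch of reading a bidirectional loop in
place instead of from unrolled sample data. The helper names `bidi_index` and
`bidi_read` are hypothetical and not taken from the actual mixer. The sketch
assumes the loop covers frames [loop_start, loop_end) and that the reverse
pass replays the end frames, as a naive unrolled copy would:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the real mixer code): map a position that
 * advances monotonically, as if the loop had been unrolled, to an index
 * into the original sample data. The loop is [loop_start, loop_end). */
static size_t bidi_index(size_t pos, size_t loop_start, size_t loop_end)
{
    size_t len, offset;

    assert(loop_start < loop_end);
    if (pos < loop_end)
        return pos;                        /* before the first reversal */

    len = loop_end - loop_start;
    offset = (pos - loop_start) % (2 * len);
    if (offset < len)
        return loop_start + offset;        /* forward pass */
    return loop_end - 1 - (offset - len);  /* reverse pass */
}

/* Read one frame of an 8-bit mono sample without duplicating its data. */
static signed char bidi_read(const signed char *data, size_t pos,
                             size_t loop_start, size_t loop_end)
{
    return data[bidi_index(pos, loop_start, loop_end)];
}
```

With this mapping a bidirectional loop needs no extra RAM: the same frames
are read in both directions, and only the index calculation changes.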