We trim the first few samples of the Opus decoder's output,
to give the decoder time to converge and to correct for the
encoder's delay. Encoders store this delay in the preskip
field of the Ogg encapsulation header.
The previous code for this was a hack based on the granulepos
values and could fail on some inputs. Instead, keep a count
of how many samples we still want to skip, and remove that
much data from decoded packets until the count is exhausted.
The count is initialized to the preskip value from the header
when the decoder is set up. We also need to do this after a
seek, so we add a specialized nsOggReader::ResetDecode
method which takes a boolean argument, set to true when
we are seeking to the start of the stream. In that case
the method resets the skip count.
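A minimal sketch of the trimming logic, assuming an nsOpusState-like
object with a skip counter (the struct, member, and method names here
are illustrative, not the actual Gecko code):

    // Sketch only: keep a running skip count and trim it off the front
    // of each decoded packet before handing samples to the audio pipeline.
    #include <algorithm>
    #include <cstdint>
    #include <cstring>

    struct OpusSkipState {
      int32_t mPreSkip = 0;  // preskip value parsed from the OpusHead header
      int32_t mSkip = 0;     // samples still to discard

      // Called when the decoder is initialized, and again when seeking
      // back to the start of the stream.
      void ResetSkip() { mSkip = mPreSkip; }

      // Trim up to mSkip frames from an interleaved decoded buffer.
      // Returns the number of frames left for playback.
      int32_t Trim(float* aBuffer, int32_t aFrames, int32_t aChannels) {
        int32_t skip = std::min(mSkip, aFrames);
        mSkip -= skip;
        int32_t keep = aFrames - skip;
        memmove(aBuffer, aBuffer + skip * aChannels,
                keep * aChannels * sizeof(float));
        return keep;
      }
    };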
There is still an issue after general seeks. The spec recommends
trimming a full 80 ms (3840 frames) to allow the decoder to fully
settle from the previous state. It's tricky to do this inside
nsOpusState because it doesn't know where it is in the stream.
Also add some debug output to track the decode behaviour.
On 2012 May 10, the Ogg encapsulation spec for Opus at
https://wiki.xiph.org/OggOpus bumped the version number
from zero to one. The one-byte field is also now notionally
split into major and minor subfields, with incompatible
changes signalled by the major field.
We update nsOpusState::DecodeHeader to parse the version
field separately from the stream identification and to reject
any stream where the high four bits of the version field
are non-zero.
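Roughly, the check amounts to the following (a sketch only; the layout
follows the OggOpus identification header, with the 8-byte "OpusHead"
magic followed by a one-byte version field, but the function name and
error handling are illustrative):

    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Sketch of the version check, not the actual DecodeHeader code.
    bool CheckOpusHeadVersion(const unsigned char* aData, size_t aLength) {
      if (aLength < 9 || memcmp(aData, "OpusHead", 8) != 0) {
        return false;                 // not an Opus identification header
      }
      uint8_t version = aData[8];
      if ((version & 0xF0) != 0) {
        return false;                 // incompatible major version: reject
      }
      // Versions 0..15 (major 0) are accepted; parsing continues with
      // channel count, preskip, input sample rate, gain and mapping.
      return true;
    }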
The opus-tools repo was updated on 2012 May 22 to write
version = 1. This commit enables playback of those files.
For media resources whose streams are captured before the load has started, we shouldn't even start
an audio thread. This saves a lot of resources and ensures we don't see races between the audio thread
and the code that copies packets from the audio queue to the MediaStreams.
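Conceptually the guard looks something like this (a sketch; mAudioCaptured
and the helper names stand in for the real decoder state machine members):

    // Sketch only: when output is captured into MediaStreams before load,
    // skip creating the audio thread entirely.
    struct StateMachineSketch {
      bool mAudioCaptured = false;   // stream captured before load started

      void StartAudioThread() { /* spin up the audio thread */ }

      void MaybeStartPlayback() {
        if (mAudioCaptured) {
          // Packets are copied from the audio queue straight to the
          // MediaStreams; no audio thread or nsAudioStream is needed.
          return;
        }
        StartAudioThread();
      }
    };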
The first part just handles the case where nsAudioStream failed to allocate a stream. It won't be playing
anything, so instead of trying to get the audio position, just fall back to the media graph current time.
Otherwise GetPositionInFrames returns -1 and things go badly from there.
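In sketch form (the types and names are simplified stand-ins, not the
actual state machine code):

    #include <cstdint>

    // Stand-in for nsAudioStream: position is -1 when allocation failed.
    struct AudioStreamSketch {
      int64_t mPositionFrames = -1;
      int64_t GetPositionInFrames() const { return mPositionFrames; }
    };

    // Sketch of the clock fallback: if there is no usable audio position,
    // report the media graph's current time instead.
    int64_t GetClock(const AudioStreamSketch* aStream,
                     int64_t aGraphTimeUsecs, int64_t aRate) {
      int64_t frames = aStream ? aStream->GetPositionInFrames() : -1;
      if (frames < 0) {
        return aGraphTimeUsecs;            // media graph current time
      }
      return frames * 1000000 / aRate;     // frames -> microseconds
    }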
The second part simplifies the calculation of the next mCurrentTime so that
it is based purely on real time. We had code to keep it from advancing past
the end of a stream's buffer, but the next part makes that unnecessary.
The third part is the real fix. When the new current time has advanced past mBlockingDecisionsMadeUntilTime,
that means the control loop didn't run in time to replenish the audio output buffers and keep up with its
other duties. Effectively all streams have been blocked between mBlockingDecisionsMadeUntilTime and
the new current time. Account for that by adding the difference as extra blocked time for every stream.
Because the rest of the code only treats blocking decisions up to
mBlockingDecisionsMadeUntilTime as valid, it's enough to mark each stream as
blocked from mBlockingDecisionsMadeUntilTime indefinitely far into the future,
and then advance mBlockingDecisionsMadeUntilTime to the new current time.
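A sketch of that accounting (the real code works on GraphTime and per-stream
sets of blocked intervals; the types here are simplified stand-ins):

    #include <cstdint>
    #include <vector>

    struct StreamSketch {
      int64_t mBlockedTime = 0;                 // accumulated blocked time
      void AddBlocked(int64_t aFrom, int64_t aTo) {
        mBlockedTime += aTo - aFrom;
      }
    };

    // If the control loop fell behind, no blocking decisions exist for the
    // interval [blockingDecisionsMadeUntilTime, nextCurrentTime), so count
    // it as blocked time for every stream, then advance the horizon.
    void AdvanceCurrentTime(std::vector<StreamSketch>& aStreams,
                            int64_t& aBlockingDecisionsMadeUntilTime,
                            int64_t aNextCurrentTime) {
      if (aNextCurrentTime > aBlockingDecisionsMadeUntilTime) {
        for (StreamSketch& stream : aStreams) {
          stream.AddBlocked(aBlockingDecisionsMadeUntilTime, aNextCurrentTime);
        }
        aBlockingDecisionsMadeUntilTime = aNextCurrentTime;
      }
    }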