This appears to be how Creative's Windows drivers handle it, and is necessary
for at least the Windows version of UT2k4 (otherwise it tries to play a source
while suspended, checks and sees it's stopped, then kills it before it's given
a chance to start playing).
Consequently, the internal properties it gets mixed with are determined by what
the source properties are at the time of the play call, and the listener
properties at the time of the suspend call.
This does not change alDeferUpdatesSOFT, which will still hold the play state
change until alProcessUpdatesSOFT.
Note that this now also causes all playing sources to update when an effect
slot is updated. This is a bit wasteful, as it should only need to re-update
sources that are using the effect slot (and only when a relevant property is
changed), but it's good enough. Especially with deferring since all playing
sources are going to get updated on the process call anyway.
This allows us to not have to play around with trying to avoid duplicate state
pointers, since the reference count will ensure they're deleted as appropriate.
The only caveat is that the mixer is not allowed to decrement references, since
that can cause the object to be freed (which the mixer code is not allowed to
do).
The source's voice holds a copy of the last properties it received, so listener
updates can make sources recalculate internal properties from that stored copy.
Last time this attempted to average the HRIRs according to their contribution
to a given B-Format channel as if they were loudspeakers, as well as averaging
the HRIR delays. The latter part resulted in the loss of the ITD (inter-aural
time delay), a key component of HRTF.
This time, the HRIRs are averaged similar to above, except instead of averaging
the delays, they're applied to the resulting coefficients (for example, a delay
of 8 would apply the HRIR starting at the 8th sample of the target HRIR). This
does roughly double the IR length, as the largest delay is about 35 samples
while the filter is normally 32 samples. However, this is still smaller the
original data set IR (which was 256 samples), it also only needs to be applied
to 4 channels for first-order ambisonics, rather than the 8-channel cube. So
it's doing twice as much work per sample, but only working on half the number
of samples.
Additionally, since the resulting HRIRs no longer rely on an extra delay line,
a more efficient HRTF mixing function can be made that doesn't use one. Such a
function can also avoid the per-sample stepping parameters the original uses.
Currently incomplete, as second- and third-order output will not correctly
handle B-Format input buffers. A standalone up-sampler will be needed, similar
to the high-quality decoder.
Also, output is ACN ordering with SN3D normalization. A config option will
eventually be provided to change this if desired.
Sometimes the mixer is temporarily prevented from applying updates, when
multiple sources need to be updated simultaneously for example, but does not
prevent mixing. If the mixer runs during that time and a voice was just
started, it would've mixed the voice without any internal properties being set
for it.
This will also allow backends to better synchronize the tracked clock time with
the device output latency, without necessarily needing to lock if the backend
API can allow for it.