Currently the only way SSE 4.1 is detected is by using __get_cpuid, i.e. with
GCC. Windows' IsProcessorFeaturePresent does not report SSE4.1 capabilities.
They are still there for auxiliary sends. However, they should go away soon
enough too, and then we won't have to mess around with calculating extra
"predictive" samples in the mixer.
The purpose of this is to provide a safe way to be able to "swap" resources
used by the mixer from other threads without the need to block the mixer, as
well as a way to track when mixes have occurred. The idea is two-fold:
It provides a way to safely swap resources. If the mixer were to (atomically)
get a reference to an object to access it from, another thread would be able
allocate and prepare a new object then swap the reference to it with the stored
one. The other thread would then be able to wait until (count&1) is clear,
indicating the mixer is not running, before safely freeing the old object for
the mixer to use the new one.
It also provides a way to tell if the mixer has run. With this, a thread would
be able to read multiple values, which could be altered by the mixer, without
requiring a mixer lock. Comparing the before and after counts for inequality
would signify if the mixer has (started to) run, indicating the values may be
out of sync and should try getting them again. Of course, it will still need
something like a RWLock to ensure another (non-mixer) thread doesn't try to
write to the values at the same time.
Note that because of the possibility of overflow, the counter is not reliable
as an absolute count.
This is for unpacking (reading, e.g. alBufferData) and packing (writing, e.g.
alGetBufferSamplesSOFT) operations. The alignments are specified in sample
frames, with 0 meaning the default (65 for IMA4, 1 otherwise). IMA4 alignment
must be a multiple of 8, plus 1 (e.g. alignment = n*8 + 1), otherwise an error
will occur during (un)packing. Chenging the block alignment does not affect
already-loaded sample data, only future unpack/pack operations... so for
example, this is perfectly valid:
// Load mono IMA4 data with a block alignment of 1024 bytes, or 2041 sample
// frames.
alBufferi(buffer, AL_UNPACK_BLOCK_ALIGNMENT_SOFT, 2041);
alBufferData(buffer, AL_FORMAT_MONO_IMA4, data, data_len, srate);
alBufferi(buffer, AL_UNPACK_BLOCK_ALIGNMENT_SOFT, 0);
Uses AL_MONO_SOFT, AL_RIGHT_SOFT, and AL_LEFT_SOFT. "Linked" samples types
aren't explicitly supported due to being under-defined in the SF2 spec, nor are
ROM samples currently.