Turns out the C version of the cubic resampler is just slightly faster than
even the SSE3 version of the FIR4 resampler. This is likely because the cubic
resampler doesn't need the 64KB random-access lookup table or the unaligned
loads that FIR4 uses, both of which offset the gains from SSE.
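For reference, a 4-point cubic interpolator needs only a handful of
multiply-adds per output sample and no table. A minimal sketch, assuming
Catmull-Rom coefficients (the actual resampler's coefficients may differ);
cubic_interp is a hypothetical name:

```c
/* Hypothetical 4-point cubic interpolation, assuming Catmull-Rom
 * coefficients. v0..v3 are consecutive input samples and mu, in [0,1), is
 * the fractional position between v1 and v2. */
static inline float cubic_interp(float v0, float v1, float v2, float v3,
                                 float mu)
{
    float mu2 = mu*mu;
    float mu3 = mu2*mu;
    /* Per-sample weights, computed directly with no lookup table. */
    float a0 = -0.5f*mu3 +      mu2 - 0.5f*mu;
    float a1 =  1.5f*mu3 - 2.5f*mu2 + 1.0f;
    float a2 = -1.5f*mu3 + 2.0f*mu2 + 0.5f*mu;
    float a3 =  0.5f*mu3 - 0.5f*mu2;
    return v0*a0 + v1*a1 + v2*a2 + v3*a3;
}
```

At mu = 0 this returns v1 exactly, and evenly spaced (linear) input is
reproduced without error.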
And fix the output filtering. The modulation code is still there since it's
(probably) technically correct, but the interaction with the feedback loop and
filtering on the output caused improper behavior which needs to be sorted out.
Even though it's taking the address of a member, it's still technically a
dereference and thus undefined behavior. sizeof doesn't "execute" the
expression, so dereferencing in it instead is fine.
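One common form of this pattern, sketched with a hypothetical struct (the
struct and function names here are illustrative, not the ones from the
changed code):

```c
#include <stddef.h>

/* Hypothetical struct, for illustration only. */
struct Example {
    int   count;
    float gains[4];
};

/* Undefined behavior: taking the member's address still goes through a
 * null pointer, even though no memory is read:
 *
 *     size_t off = (size_t)&((struct Example*)NULL)->gains;
 */

/* Well-defined: offsetof is provided by <stddef.h> for exactly this. */
static size_t gains_offset(void)
{ return offsetof(struct Example, gains); }

/* Well-defined: the operand of sizeof is not evaluated (it is not a
 * variable-length array), so only its type is inspected. */
static size_t gains_size(void)
{ return sizeof(((struct Example*)NULL)->gains); }
```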
The core LateReverb_* functions are explicitly written out now, since the
tapping and blending done by the Faded version is a bit more complex and it's
not so easy to ensure proper optimizing on the Unfaded version.
The outputs themselves use a variable-delay tap, but using a separate
fixed-delay tap on the feedback helps improve the perceived "wobble" with
sustained notes. This also applies to the flanger effect.
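The tap arrangement can be sketched as follows. This is a simplified
illustration, not the effect's actual code: DelayLine, delay_process, and
DELAY_LEN are hypothetical names, and delays are whole samples here, whereas
a real modulated tap would normally interpolate between samples.

```c
#include <stddef.h>

#define DELAY_LEN 1024u  /* power of two so indices can be masked */

/* Hypothetical delay-line state, for illustration only. */
struct DelayLine {
    float  buf[DELAY_LEN];
    size_t write_pos;
};

/* Processes one sample. out_delay is the modulated (variable) delay used
 * for the output tap; fb_delay is a fixed delay used only for the feedback
 * tap. Both are in samples and must be in [1, DELAY_LEN-1]. */
static float delay_process(struct DelayLine *d, float in, size_t out_delay,
                           size_t fb_delay, float feedback)
{
    const size_t mask = DELAY_LEN - 1;

    /* The feedback reads from the fixed tap, so modulating out_delay does
     * not change what is fed back into the line. */
    float fb_tap = d->buf[(d->write_pos - fb_delay) & mask];
    d->buf[d->write_pos & mask] = in + fb_tap*feedback;

    /* The output reads from the variable tap. */
    float out = d->buf[(d->write_pos - out_delay) & mask];
    d->write_pos++;
    return out;
}
```

Feeding a single impulse with out_delay = 2, fb_delay = 4, and feedback = 0.5
produces the impulse at output sample 2 and its first echo, at half
amplitude, at output sample 6.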
This is to allow buffer layering, where multiple buffers of the same format and
sample rate are mixed together prior to resampling, filtering, and
panning. This will allow composing sounds from individual components that can
be swapped around on different invocations (e.g. layer SoundA and SoundB on one
instance and SoundA and SoundC on a different instance for a slightly different
sound, then just SoundA for a third instance, and so on). The longest buffer
within the list item determines the length of the list item.
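A rough sketch of the mixing step, assuming float mono samples for
simplicity. The Layer type and both function names are hypothetical and only
illustrate the idea of summing layers and using the longest one as the item
length:

```c
#include <stddef.h>

/* Hypothetical layer description, for illustration only. */
struct Layer {
    const float *samples;
    size_t       length;  /* in sample frames */
};

/* Length of a list item: the longest of its layers. */
static size_t item_length(const struct Layer *layers, size_t count)
{
    size_t len = 0;
    for(size_t i = 0;i < count;i++)
    {
        if(layers[i].length > len)
            len = layers[i].length;
    }
    return len;
}

/* Sums all layers into dst, starting at frame `offset` of the list item.
 * Layers shorter than the item contribute silence past their end. Returns
 * the number of frames written, which may be less than `todo`. */
static size_t mix_layers(const struct Layer *layers, size_t count,
                         size_t offset, float *dst, size_t todo)
{
    size_t total = item_length(layers, count);
    if(offset >= total) return 0;
    if(todo > total - offset) todo = total - offset;

    for(size_t i = 0;i < todo;i++)
        dst[i] = 0.0f;
    for(size_t l = 0;l < count;l++)
    {
        size_t avail = (layers[l].length > offset) ?
                       layers[l].length - offset : 0;
        if(avail > todo) avail = todo;
        for(size_t i = 0;i < avail;i++)
            dst[i] += layers[l].samples[offset + i];
    }
    return todo;
}
```

The mixed output would then go through resampling, filtering, and panning as
a single stream.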
More work needs to be done to fully support it, namely the ability to specify
multiple buffers to layer for static and streaming sources. Also the behavior
of loop points for layered static sources should be worked out. Should also
consider allowing each layer to have a sample offset.
Now FuMa and ACN channel orders are required, as are FuMa, SN3D, and N3D
normalization schemes. An integer query (alcGetIntegerv) is added for the
maximum ambisonic order.
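To illustrate how the layouts and scalings differ, here is a first-order-only
sketch that converts a frame from FuMa order and scaling to ACN order with
SN3D or N3D scaling. The function name is hypothetical and this is not the
library's conversion code; higher orders need additional per-channel factors
not shown here.

```c
/* Converts one first-order B-Format frame from FuMa channel order and
 * scaling (W, X, Y, Z) to ACN channel order (W, Y, Z, X). If n3d is
 * nonzero the output uses N3D scaling, otherwise SN3D. Hypothetical helper,
 * first order only. */
static void fuma_to_acn_first_order(const float fuma[4], float acn[4],
                                    int n3d)
{
    /* FuMa attenuates W by 1/sqrt(2); SN3D and N3D both use unit W. */
    const float w_scale = 1.41421356f;
    /* First-order FuMa X/Y/Z match SN3D; N3D scales them by sqrt(3). */
    const float xyz_scale = n3d ? 1.73205081f : 1.0f;

    acn[0] = fuma[0] * w_scale;    /* W */
    acn[1] = fuma[2] * xyz_scale;  /* Y */
    acn[2] = fuma[3] * xyz_scale;  /* Z */
    acn[3] = fuma[1] * xyz_scale;  /* X */
}
```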
The context state properties are less likely to change compared to the listener
state, and future changes may prefer more infrequent updates to the context
state.
Note that this puts the MetersPerUnit in as a context state, even though it's
handled through the listener functions. Considering how infrequently it's
updated (generally set just once for the context's lifetime), it makes more
sense to put it there than with the more frequently updated listener
properties. The aforementioned future changes would also prefer MetersPerUnit
to not be updated unnecessarily.