Note that this is the multiple above the device sample rate, rather than the
source property limit. It could theoretically be increased to 511 by testing
against UINT_MAX instead of INT_MAX, since the increment and positions are
using unsigned integers. I'm just being paranoid about overflows.
Note that it still uses FuMa scalings internally. Coefficients loaded from
config files specify if they're FuMa (in both ordering and scaling) or N3D,
and will get reordered or rescaled as needed.
This reverts commit 7ffb9b3056ab280d5d9408fd023f3cfb370ed103.
It was behaving as appropriate before (orienting left did pan it left for the
listener), I was apparently just misinterpreting the matrix.
The rotation erroneously specified the orientation of the source relative to
the sound field, whereas it should be the orientation of the sound field *and*
source relative to the listener. So now when the source is oriented left, the
front of the sound field is to the left of the listener.
For sources with a non-0 radius:
When distance <= radius, factor = distance/radius*0.5
When distance > radius, factor = 1 - asinf(radius/distance)/PI
Also, avoid using Position after calculating the localized direction and
distance.
This method is intended to help development by easily testing the quality of
the B-Format encode and B-Format-to-HRTF decode. When used with HRTF, all
sources are renderer using the virtual B-Format output, rather than just
B-Format sources.
Despite the CPU cost savings (only four channels need to be filtered with HRTF,
while sources all render normally), the spatial acuity offered by the B-Format
output is pretty poor since it's only first-order ambisonics, so "full" HRTF
rendering is definitely preferred.
It's /possible/ for some systems to be edge cases that prefer the CPU cost
savings provided by basic over the sharper localization provided by full, and
you do still get 3D positional cues, but this is unlikely to be an actual use-
case in practice.
This adds the ability to directly decode B-Format with HRTF, though only first-
order (WXYZ) for now. Second- and third-order would be easilly doable, however
we'd need to be able to up-mix first-order content (from the BFORMAT2D and
BFORMAT3D buffer formats) since it would be inappropriate to decode lower-order
content with a higher-order decoder.
The sound localization with virtual channel mixing was just too poor, so while
it's more costly to do per-source HRTF mixing, it's unavoidable if you want
good localization.
This is only partially reverted because having the virtual channel is still
beneficial, particularly with B-Format rendering and effect mixing which
otherwise skip HRTF processing. As before, the number of virtual channels can
potentially be customized, specifying more or less channels depending on the
system's needs.
This new method mixes sources normally into a 14-channel buffer with the
channels placed all around the listener. HRTF is then applied to the channels
given their positions and written to a 2-channel buffer, which gets written out
to the device.
This method has the benefit that HRTF processing becomes more scalable. The
costly HRTF filters are applied to the 14-channel buffer after the mix is done,
turning it into a post-process with a fixed overhead. Mixing sources is done
with normal non-HRTF methods, so increasing the number of playing sources only
incurs normal mixing costs.
Another benefit is that it improves B-Format playback since the soundfield gets
mixed into speakers covering all three dimensions, which then get filtered
based on their locations.
The main downside to this is that the spatial resolution of the HRTF dataset
does not play a big role anymore. However, the hope is that with ambisonics-
based panning, the perceptual position of panned sounds will still be good. It
is also an option to increase the number of virtual channels for systems that
can handle it, or maybe even decrease it for weaker systems.