Last time this attempted to average the HRIRs according to their contribution
to a given B-Format channel as if they were loudspeakers, as well as averaging
the HRIR delays. The latter part resulted in the loss of the ITD (inter-aural
time delay), a key component of HRTF.
This time, the HRIRs are averaged similar to above, except instead of averaging
the delays, they're applied to the resulting coefficients (for example, a delay
of 8 would apply the HRIR starting at the 8th sample of the target HRIR). This
does roughly double the IR length, as the largest delay is about 35 samples
while the filter is normally 32 samples. However, this is still smaller the
original data set IR (which was 256 samples), it also only needs to be applied
to 4 channels for first-order ambisonics, rather than the 8-channel cube. So
it's doing twice as much work per sample, but only working on half the number
of samples.
Additionally, since the resulting HRIRs no longer rely on an extra delay line,
a more efficient HRTF mixing function can be made that doesn't use one. Such a
function can also avoid the per-sample stepping parameters the original uses.
Currently incomplete, as second- and third-order output will not correctly
handle B-Format input buffers. A standalone up-sampler will be needed, similar
to the high-quality decoder.
Also, output is ACN ordering with SN3D normalization. A config option will
eventually be provided to change this if desired.
Instead of looping over all the coefficients for each channel with multiplies,
when we know only one will have a non-0 factor for ambisonic mixing buffers,
just index the one with a non-0 factor.
By default, stereo outputs using UHJ, which renders to a B-Format buffer that
gets run through a UHJ encoder for output. It may also output using pair-wise
panning, which pans between -30 and +30 degrees with the speakers at the two
ends. In both cases, the stereo coefficients are ignored.
Mono, having only one output channel, can realistically only attenuate its
channel. Turning the volume up and down accomplishes the same result.
Could really do with some optimizations to the mixing gain calculations. For
ambisonic targets, the coefficients will only have 1 non-0 entry for each
output, so the double loop in unnecessarily wasteful. Similarly, most uses
won't need a full height encoding either, so a horizontal-only or mixed-order
target could reduce the number of channels.
This uses a virtual B-Format buffer for mixing, and then uses a dual-band
decoder for improved positional quality. This currently only works with first-
order output since first-order input (from the AL_EXT_BFROMAT extension) would
not sound correct when fed through a second- or third-order decoder.
This also does not currently implement near-field compensation since near-field
rendering effects are not implemented.