Rather than applying the HF scale to the IRs, which requires truncating them
as well as increasing the IR size, it can be applied to the input signal for
the same result. Consequently, the IR size can be notably shortened while
avoiding the extra truncation. The delayed reversed all-pass technique can
still be used on the input to maintain phase when applying the
bandsplit/hfscalar filter to the input signal.
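
The equivalence relied on above follows from convolution being commutative
and associative. A minimal sketch, assuming a short FIR (the hypothetical
HfFilter below) as a stand-in for the real HF scale filter, and leaving the
all-pass phase handling out entirely:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Full linear convolution of two finite sequences.
std::vector<double> Convolve(const std::vector<double> &a,
    const std::vector<double> &b)
{
    if(a.empty() || b.empty()) return {};
    std::vector<double> out(a.size() + b.size() - 1, 0.0);
    for(std::size_t i{0};i < a.size();++i)
    {
        for(std::size_t j{0};j < b.size();++j)
            out[i+j] += a[i] * b[j];
    }
    return out;
}

// Hypothetical stand-in for the HF scale, not the actual filter.
const std::vector<double> HfFilter{0.5, 0.3, 0.2};

// Option 1: bake the filter into the IR. The filtered IR is longer than the
// original by HfFilter.size()-1 taps.
std::vector<double> FilterIrThenApply(const std::vector<double> &input,
    const std::vector<double> &ir)
{ return Convolve(input, Convolve(ir, HfFilter)); }

// Option 2: filter the input instead and leave the IR at its original size.
std::vector<double> FilterInputThenApply(const std::vector<double> &input,
    const std::vector<double> &ir)
{ return Convolve(Convolve(input, HfFilter), ir); }

// Largest per-sample difference between two equally sized outputs.
double MaxAbsDifference(const std::vector<double> &a,
    const std::vector<double> &b)
{
    double diff{0.0};
    for(std::size_t i{0};i < a.size() && i < b.size();++i)
        diff = std::max(diff, std::fabs(a[i] - b[i]));
    return diff;
}
```

Both options produce the same output up to rounding, but only the first one
grows the IR.
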
Using a mask helps the compiler recognize that the leftover (any remaining
non-multiple-of-4) and realignment loops will have at most 3 iterations, which
it can unroll or otherwise optimize more effectively. Previously it would try
to vectorize and partially unroll the loops, which is wasteful when there are
never enough iterations to vectorize.
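
As a rough illustration of the masking idea, here is a sketch with a
hypothetical MixSamples routine (not the actual mixer code). The leftover
count is computed as count & 3, so its 0..3 range is visible to the compiler
directly from the expression:

```cpp
#include <cstddef>

// Hypothetical mixing routine: adds src*gain into dst.
void MixSamples(float *dst, const float *src, float gain, std::size_t count)
{
    // Main body: whole groups of 4 samples, the part worth vectorizing.
    std::size_t pos{0};
    for(std::size_t blocks{count >> 2};blocks > 0;--blocks)
    {
        dst[pos+0] += src[pos+0] * gain;
        dst[pos+1] += src[pos+1] * gain;
        dst[pos+2] += src[pos+2] * gain;
        dst[pos+3] += src[pos+3] * gain;
        pos += 4;
    }

    // Leftover: the mask bounds the trip count to 0..3, so there is nothing
    // here for the compiler to try to vectorize.
    for(std::size_t leftover{count & 3};leftover > 0;--leftover)
    {
        dst[pos] += src[pos] * gain;
        ++pos;
    }
}
```
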
All the methods used should be compliant with C++14 constexpr rules. However,
the number of scales and phases causes GenerateBSincCoeffs to reach the allowed
step limit, preventing full compile-time generation. It's not a terribly big
deal since the coefficients are generated very quickly when loading, but it
does prevent using shared read-only memory pages.
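
For reference, a minimal sketch of C++14-style compile-time table generation,
using a small hypothetical SquareTable rather than the real bsinc tables. A
table this size evaluates well within compiler limits; a much larger
generator can run into the evaluation cap (adjustable with, for example,
Clang's -fconstexpr-steps or GCC's -fconstexpr-ops-limit), at which point
the constexpr on the object has to be dropped and the table is built at load
time instead:

```cpp
#include <cstddef>

constexpr std::size_t TableSize{16};

// Hypothetical table type; the constructor body uses a loop, which C++14
// constexpr rules allow.
struct SquareTable {
    float values[TableSize]{};

    constexpr SquareTable()
    {
        for(std::size_t i{0};i < TableSize;++i)
            values[i] = static_cast<float>(i * i);
    }
};

// Generated at compile time, so it can be placed in read-only memory.
constexpr SquareTable gSquareTable{};

static_assert(gSquareTable.values[3] == 9.0f, "table built at compile time");
```
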
It notably simplifies things to mix HRTF sources into an accumulation buffer
together, which the Dry buffer's Ambisonic-to-HRTF decode is then added to,
before being mixed to the Real output.
It's somewhat ambiguous what they mean: sometimes they act as a pointer, other
times they have weird behavior. Pointer-to-function types are explicitly
defined as such, whereas uses of these tend to be as references (never null
and not changeable).
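
Assuming "they" refers to bare function types (the message does not name them
explicitly), the ambiguity can be sketched as follows. The names Callback,
Holder, and the Call* helpers are made up for illustration:

```cpp
#include <type_traits>

using Callback = int(int);        // bare function type
using CallbackPtr = int(*)(int);  // explicit pointer: may be null, reseatable
using CallbackRef = int(&)(int);  // reference: never null, not reseatable

int Twice(int x) { return x * 2; }

// As a parameter, the bare function type is silently adjusted to a pointer.
int CallByType(Callback func, int x) { return func(x); }
int CallByPtr(CallbackPtr func, int x) { return func ? func(x) : 0; }
int CallByRef(CallbackRef func, int x) { return func(x); }

static_assert(std::is_same<decltype(CallByType), decltype(CallByPtr)>::value,
    "a function-type parameter is really a pointer parameter");

// As a class member, the same type declares a member function instead of a
// data member.
struct Holder {
    Callback method;
};
int Holder::method(int x) { return x + 1; }
```
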
This puts the base coefficients and the phase deltas next to each other. This
improves caching, as the base and phase deltas are always used together while
the scales are only used for the non-fast versions.
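
One way to picture the resulting layout is as per-phase blocks in which the
base coefficients are immediately followed by the phase deltas. The sketch
below is an assumption about the ordering, with hypothetical names, and only
computes offsets (in floats) into such a table:

```cpp
#include <cstddef>

// Offsets of each coefficient group for one phase of a filter with the given
// number of taps.
struct PhaseBlock {
    std::size_t base;            // base filter coefficients
    std::size_t phaseDelta;      // phase deltas, directly after the base
    std::size_t scaleDelta;      // scale deltas, non-fast path only
    std::size_t scalePhaseDelta; // scale-phase deltas, non-fast path only
};

constexpr PhaseBlock GetPhaseBlock(std::size_t taps, std::size_t phaseIndex)
{
    return PhaseBlock{
        phaseIndex*taps*4,
        phaseIndex*taps*4 + taps,
        phaseIndex*taps*4 + taps*2,
        phaseIndex*taps*4 + taps*3
    };
}
```

With this ordering, a fast path that needs only the base and phase deltas
reads one contiguous run of 2*taps floats per phase.
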
This takes advantage of the fact that when increment <= 1 (when not
down-sampling), the scale factor is always 0. As a result, the scale and
scale-phase deltas never contribute to the filtered output. Removing those
multiply+add operations cuts the work done by the inner loop in half.
Sounds that do need to down-sample (when played at a high pitch, or when 48kHz
audio is played on a 44.1kHz output, for example) still go through the normal
bsinc process.
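
The reduction can be sketched with two inner loops. The parameter names
(fil, phd, scd, spd for the base, phase-delta, scale-delta, and
scale-phase-delta coefficients, sf and pf for the scale and phase factors)
and the exact interpolation form are assumptions for illustration, not
necessarily the code as committed:

```cpp
#include <cstddef>

// Normal path: interpolates over both scale (sf) and phase (pf).
float FilterFull(const float *fil, const float *phd, const float *scd,
    const float *spd, const float *src, std::size_t taps, float sf, float pf)
{
    float r{0.0f};
    for(std::size_t j{0};j < taps;++j)
        r += (fil[j] + sf*scd[j] + pf*(phd[j] + sf*spd[j])) * src[j];
    return r;
}

// Fast path: with sf == 0 the scd and spd terms vanish, leaving only the
// base coefficients and phase deltas.
float FilterFast(const float *fil, const float *phd, const float *src,
    std::size_t taps, float pf)
{
    float r{0.0f};
    for(std::size_t j{0};j < taps;++j)
        r += (fil[j] + pf*phd[j]) * src[j];
    return r;
}
```

When sf is 0 the two functions compute the same sum, so a caller can select
the fast variant whenever the increment is at most 1 and fall back to the
full variant otherwise.
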