AL_LOKI_quadriphonic is left alone since that is what the extension is called
and what code expects. All other instances have been fixed for consistency.
Turns out the C version of the cubic resampler is just slightly faster than
even the SSE3 version of the FIR4 resampler. This is likely because the cubic
resampler avoids the 64KB random-access lookup table and the unaligned loads
that FIR4 relies on, both of which offset the gains from SSE.
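
As a rough illustration of why no coefficient table is needed, a 4-point cubic
can compute its weights directly from the fractional position. The sketch below
assumes the Catmull-Rom form; the name cubic_interp and the exact coefficients
are illustrative assumptions, not taken from the actual resampler.

```c
/* Hypothetical helper (assumed Catmull-Rom form): interpolates between s1
 * and s2, with s0 and s3 as the outer neighbours and mu in [0,1) as the
 * fractional position between s1 and s2. */
static float cubic_interp(float s0, float s1, float s2, float s3, float mu)
{
    const float mu2 = mu * mu;
    const float mu3 = mu2 * mu;

    /* Weights are computed on the fly, so no lookup table is read. */
    const float a0 = -0.5f*mu3 +      mu2 - 0.5f*mu;
    const float a1 =  1.5f*mu3 - 2.5f*mu2 + 1.0f;
    const float a2 = -1.5f*mu3 + 2.0f*mu2 + 0.5f*mu;
    const float a3 =  0.5f*mu3 - 0.5f*mu2;

    return s0*a0 + s1*a1 + s2*a2 + s3*a3;
}
```

All four weights come from a handful of multiplies on mu, so the inner loop
only touches the input samples themselves.
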
This improves the transition width, allowing more of the higher frequencies
to remain audible. It would be preferable to have an upper limit of 32 points
instead of 48, to reduce the overall table size and the CPU cost for
downsampling.
Command line parameters and filenames are now Unicode-aware (the .def files
should be UTF-8 encoded if they contain any non-ASCII characters). Unicode
characters might not display correctly in the console, but they should be
processed correctly.
Partially reverts 3f3a3ac4f1d069542ca2399a8b5e63d9d1a4df3b. The l*2 + 1 is
correct when you want an odd number of sample points, which avoids an
unnecessary phase offset in the filter. However, the rounding is still needed
when calculating the left offset (l), or else the transition width can increase
with an odd-numbered order.
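
A minimal sketch of the arithmetic described here, assuming the order is an
integer and that l is half the order rounded to nearest. The names half_width
and point_count are hypothetical, not functions from the code base.

```c
/* Hypothetical helper: left offset l, half the order rounded to nearest.
 * Plain truncation (order / 2) would lose a point on each side whenever
 * the order is odd. */
static int half_width(int order)
{
    return (order + 1) / 2;
}

/* Hypothetical helper: l points on each side plus the centre point, so
 * the count is always odd. */
static int point_count(int order)
{
    return half_width(order) * 2 + 1;
}
```

For an order of 11, rounding gives l = 6 and 13 points, while truncating would
give l = 5 and 11 points, the same as for an order of 10.
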
This generates the filters using the proper size and scale. The 'a' divisor
should represent the +/- sample range (and thus be a whole number), with the
number of sample points being double that. Increasing the filter size to a
multiple of 4 (for SIMD) can be done by padding in 0s afterward.
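
The sizing rule can be sketched as follows, assuming 'a' has already been
reduced to a whole number. The helper names filter_points, padded_points and
pad_filter are hypothetical and only illustrate the padding step.

```c
/* Hypothetical helper: 'a' is the +/- sample range, so the filter spans
 * a points on each side of the centre, 2*a in total. */
static int filter_points(int a)
{
    return 2 * a;
}

/* Hypothetical helper: round the point count up to a multiple of 4. */
static int padded_points(int m)
{
    return (m + 3) & ~3;
}

/* Hypothetical helper: copy the m generated coefficients into dst, then
 * fill the remaining slots with zeros. dst must have room for
 * padded_points(m) values. Returns the padded length. */
static int pad_filter(float *dst, const float *src, int m)
{
    const int padded = padded_points(m);
    int i;

    for(i = 0;i < m;i++)
        dst[i] = src[i];
    for(;i < padded;i++)
        dst[i] = 0.0f;
    return padded;
}
```

Padding after generation keeps the filter's scale tied to 'a' alone, so the
extra zero taps only cost storage and do not change the response.
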
Despite the claim that it was an 11th order filter, the transition width was
generated by specifying 12th order. A 12th order filter would need 14 sample
points rather than the 12 it had.