This avoids an inherent delay from the effect, at the cost of higher CPU use.
A user-specified delay (with the user responsible for supplying a properly
trimmed impulse response) could help offset that cost, since once the delay
exceeds the segment size, the initial FIR filter could be skipped.
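
As a rough illustration of that trade-off, the sketch below plans how an
impulse response might be split between a time-domain FIR head and
FFT-processed segments. The function name plan_partitions and its parameters
(ir_length, segment_size, user_delay, all in samples) are hypothetical and not
taken from the effect's code. It assumes the FIR covers at most one segment's
worth of taps and that the comparison is strict, following the wording above.

```python
def plan_partitions(ir_length, segment_size, user_delay=0):
    """Hypothetical helper: split an impulse response between a direct FIR
    head and FFT-processed segments.

    Returns (fir_taps, fft_segments).
    """
    if ir_length < 0 or user_delay < 0 or segment_size <= 0:
        raise ValueError("lengths must be non-negative, segment_size positive")

    # Assumption: an FFT segment cannot contribute output until a full
    # segment of input has been gathered, so without enough allowed delay
    # the first segment's worth of taps is applied in the time domain.
    if user_delay > segment_size:
        # The requested delay already covers the FFT latency; skip the FIR.
        fir_taps = 0
    else:
        fir_taps = min(ir_length, segment_size)

    # Everything not handled by the FIR is covered by whole FFT segments
    # (integer ceiling division).
    remaining = ir_length - fir_taps
    fft_segments = (remaining + segment_size - 1) // segment_size
    return fir_taps, fft_segments
```

With a 1000-sample response and 256-sample segments, no delay leaves 256 taps
on the FIR path, while a 512-sample delay moves the whole response onto FFT
segments. Whether a delay exactly equal to the segment size is also enough
depends on the implementation, so the sketch keeps the FIR in that case.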