On outputs that use already-active video/audio encoders, the audio
pruning that syncs audio packets up with video packets doesn't always
get called (for example, if the video pruning function was called
instead). Always prune excess starting audio packets.
If audio buffering is very high, the audio packets built up in the
interleaved buffer would start significantly earlier than the first
video packet, making the offset between the starting video/audio
packet pair significantly wrong and leading to desync.
This issue was not spotted until recently because it only happens when
streaming/recording with the same encoders while audio buffering is
very high.
When starting a multi-track output, attempt to pair the video encoder
with one of the audio encoders so that the video and audio encoders
start as close together in time as possible. This ensures the best
possible audio/video syncing point when using multi-track audio
output.
When using multi-track audio, encoders cannot be paired like they can
when only using a single audio track with video, so the output has to
choose the best point in the interleaved buffer as the "starting
point", and if the encoders start up at different times, it has to
prune that data and wait to start the output on the next video
keyframe. When the audio encoders started up, they could take some
time to load, which would cause the pruning code to wait for the next
keyframe to ensure startup syncing.
Starting the audio encoders before starting the video encoder should
reduce the possibility of that happening in a multi-track scenario.
In a multi-track scenario it was not taking into consideration the
possibility of secondary audio tracks, which could have caused desync
on some of the audio tracks.
Problem:
When an output is started with encoders that have already been started
by another output, and it starts within the window between when the
first audio packets start coming in and when the first video packet
comes in, the output would get audio packets with timestamps
potentially much later than the first video frame by the time that
frame finally comes in. The audio/video encoders will almost always
have differing delays.
Solution:
Detect that starting window, and if within it, wait for a new keyframe
from video instead of trying to sync up with the existing video.
Additional Notes:
In these cases where an output starts with already-active encoders,
this patch also reduces the potential sync offset between the first
video frame and the first audio frame. Before, it would sync the
first video frame with the first audio frame following it, but now it
syncs with the closest audio frame in the interleaved array, which
halves the potential sync difference between the first video frame and
the first audio frame of already-active encoders. (A 1024-sample
audio frame at 44.1 kHz spans roughly 23.2 milliseconds, so the
potential sync difference drops from roughly 23.2 milliseconds to
roughly 11.6.)
API changed from:
obs_source_info::get_name(void)
obs_output_info::get_name(void)
obs_encoder_info::get_name(void)
obs_service_info::get_name(void)
API changed to:
obs_source_info::get_name(void *type_data)
obs_output_info::get_name(void *type_data)
obs_encoder_info::get_name(void *type_data)
obs_service_info::get_name(void *type_data)
This allows the type data to be used when getting the name of the
object (useful for plugin wrappers primarily).
NOTE: Though a parameter was added, this is backward-compatible with
older plugins due to the calling convention. The new parameter will
simply be ignored by older plugins, and the stack (if used) will be
cleaned up by the caller.
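
For illustration, a minimal sketch of a plugin implementing the new
callback; the wrapper struct and name here are hypothetical, not part
of libobs:

    /* hypothetical per-type data registered by a plugin wrapper */
    struct wrapper_data {
            const char *display_name;
    };

    static const char *my_source_get_name(void *type_data)
    {
            /* type_data is whatever the plugin stored for this type;
             * older plugins that still declare get_name(void) simply
             * never read the extra argument */
            struct wrapper_data *wrapper = type_data;
            return wrapper ? wrapper->display_name : "My Source";
    }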
Allows objects to be created regardless of whether the actual id
exists or not. This is a precaution that preserves objects/settings
if the id was removed for whatever reason (a plugin was removed, or a
hardware encoder disappeared). This was already added for sources,
but really needs to be added for other libobs objects as well:
outputs, encoders, services.
This fixes an issue where, when an output canceled reconnecting, the
reconnect flag was left at true, causing obs_output_active to always
return true even though reconnecting had actually been canceled.
This feature allows a user to delay an output (as long as the output
itself supports it). Needless to say, this is intended for live
streams, where users may want to delay their streams to prevent stream
sniping, cheating, and other such things.
The design this time is a bit more elaborate, but still simple: the
user can now schedule stops/starts without having to wait for the
stream itself to stop before taking further action. Optionally, they
can also forcibly stop the stream (and its delay) in case something
happens that they might not want to be streamed.
Additionally, a new option was added to preserve the stream cutoff
point on disconnections/reconnections, so that if you get disconnected
while streaming, when it reconnects, it will reconnect right at the
point where it left off. This will probably be quite useful for a
number of applications in addition to regular delay, such as setting
the delay to 1 second and using this feature to keep a critical stream
(for example, a tournament stream) from having any of its data cut
off. However, using this feature will of course cause the stream data
to buffer and increase delay (and memory usage) while it's in the
process of reconnecting.
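
A usage sketch, assuming the delay API added here is the libobs
obs_output_set_delay function with a preserve flag (treat the exact
names as illustrative):

    #include <obs.h>

    void start_delayed_stream(obs_output_t *output)
    {
            /* delay the stream by 30 seconds; OBS_OUTPUT_DELAY_PRESERVE
             * keeps the cutoff point across disconnect/reconnect so no
             * stream data is lost while reconnecting */
            obs_output_set_delay(output, 30, OBS_OUTPUT_DELAY_PRESERVE);
            obs_output_start(output);
    }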
Implements exponential backoff for consecutive reconnects, which helps
prevent too many connections from trying to reconnect to a service at
once over a short period of time in the case of service downtime.
Exponential backoff doubles the timeout duration of each subsequent
reconnect attempt.
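
A sketch of the timeout progression (function and parameter names are
hypothetical, not libobs internals):

    #include <stdint.h>

    /* each consecutive failed reconnect doubles the previous timeout:
     * e.g. 10s, 20s, 40s, 80s, ... capped so the wait stays bounded */
    static uint32_t next_reconnect_timeout(uint32_t base_sec,
                    uint32_t consecutive_failures, uint32_t max_sec)
    {
            uint64_t timeout = base_sec;
            for (uint32_t i = 0;
                 i < consecutive_failures && timeout < max_sec; i++)
                    timeout *= 2;
            return timeout > max_sec ? max_sec : (uint32_t)timeout;
    }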
This optionally allows the front-end to know what the current reconnect
timeout value in seconds is set to, without having to call an extra API
function to find that out.
When using multiple video encoders together with a single audio encoder,
the audio wouldn't be in sync.
This occurred because the dts_usec variable of the encoder packet
(which is based on system time) would always be reset to a value based
upon the dts (which is not guaranteed to be based on system time) in
the apply_interleaved_packet_offset function. This would in turn
cause it to miscalculate the starting audio/video offsets, which are
required to calculate sync.
So instead of calling that function unnecessarily, separate the check
for whether audio/video has been received into a new function, and
only start applying the interleaved offsets after audio and video have
actually started up and the starting offsets have been calculated.
API changed:
--------------------------
void obs_output_set_audio_encoder(
        obs_output_t *output,
        obs_encoder_t *encoder);

obs_encoder_t *obs_output_get_audio_encoder(
        const obs_output_t *output);

obs_encoder_t *obs_audio_encoder_create(
        const char *id,
        const char *name,
        obs_data_t *settings);
Changed to:
--------------------------
/* 'idx' specifies the track index of the output */
void obs_output_set_audio_encoder(
        obs_output_t *output,
        obs_encoder_t *encoder,
        size_t idx);

/* 'idx' specifies the track index of the output */
obs_encoder_t *obs_output_get_audio_encoder(
        const obs_output_t *output,
        size_t idx);

/* 'mixer_idx' specifies the mixer index to capture audio from */
obs_encoder_t *obs_audio_encoder_create(
        const char *id,
        const char *name,
        obs_data_t *settings,
        size_t mixer_idx);
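
For example, a two-track setup under the new API might look like the
following sketch (the encoder id and names are illustrative):

    #include <obs.h>

    void setup_two_track_output(obs_output_t *output,
                    obs_encoder_t *video_encoder)
    {
            /* track 0 captures audio from mixer 0,
             * track 1 captures audio from mixer 1 */
            obs_encoder_t *track0 = obs_audio_encoder_create(
                            "ffmpeg_aac", "stream audio", NULL, 0);
            obs_encoder_t *track1 = obs_audio_encoder_create(
                            "ffmpeg_aac", "commentary audio", NULL, 1);

            obs_output_set_video_encoder(output, video_encoder);
            obs_output_set_audio_encoder(output, track0, 0);
            obs_output_set_audio_encoder(output, track1, 1);
    }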
Overview
--------------------------
This feature allows multiple audio mixers to be used at a time. This
capability was added with surprisingly little extra overhead. Audio
will not be mixed unless it's assigned to a specific mixer, and mixers
will not mix unless they have an active mix connection.
Mostly this will be useful for being able to separate out specific audio
for recording versus streaming, but will also be useful for certain
streaming services that support multiple audio streams via RTMP.
I didn't want to use a variable number of mixers due to the desire to
reduce heap allocations, so currently the limit is set to 4
simultaneous mixers; this number can be increased later if needed, but
honestly I feel like it's just the right number to use.
Sources:
Sources can now specify which audio mixers their audio is mixed to;
this can be a single mixer or multiple mixers at a time. The
obs_source_set_audio_mixers function sets the audio mixers that an
audio source applies to, as a bitmask with one bit per mixer. For
example, 0xF would mean that the source applies to all four mixers.
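
For example (a sketch; the two sources are hypothetical):

    #include <obs.h>

    void route_audio(obs_source_t *mic, obs_source_t *music)
    {
            /* the mic applies to all four mixers (bits 0-3 set) */
            obs_source_set_audio_mixers(mic, 0xF);

            /* music applies only to mixer 0 -- e.g. audible on the
             * streaming mix but excluded from a recording mix that
             * captures from another mixer */
            obs_source_set_audio_mixers(music, 0x1);
    }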
Audio Encoders:
Audio encoders must now specify which audio mixer they capture from
when they encode audio data.
Outputs:
Outputs that use encoders can now support multiple audio tracks at once
if they have the OBS_OUTPUT_MULTI_TRACK capability flag set. This is
mostly only useful for certain types of RTMP transmissions, though may
be useful for file formats that support multiple audio tracks as well
later on.
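
A sketch of an output declaring the capability; OBS_OUTPUT_MULTI_TRACK
is the flag named above, the other flags and all callbacks are the
usual libobs output registration details and are elided here:

    #include <obs.h>

    static struct obs_output_info my_multi_track_output = {
            .id    = "my_multi_track_output",
            .flags = OBS_OUTPUT_AV | OBS_OUTPUT_ENCODED |
                     OBS_OUTPUT_MULTI_TRACK,
            /* ...get_name/create/destroy/start/stop omitted... */
    };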
In certain circumstances where the output was stopping, and where data
took a long enough time to send (such as when using an encoding preset
that causes high CPU usage), the output would sometimes still send data
even after it was stopped, typically causing the output to crash.
Apparently the audio isn't guaranteed to start up past the first video
frame, so it would trigger that assert (which I'm glad I put in). I
didn't originally have this happen when I was testing because my audio
buffering was not at the default value, so the condition never
triggered. A blunder on my part, and once again a fine example of why
you should never make assumptions about possible code paths.
This bug would happen if audio packets started being received before
video packets. It would erroneously cause audio packets to be
completely thrown away, and in certain cases would cause audio and video
to start way out of sync.
My original intention was "don't accept audio until video has
started", but it instead mistakenly had the effect of "don't start
audio until a video packet has been received". This was originally
intended as a way to handle outputs hooking into active encoders and
compensating for their existing timestamp information.
However, this made me realize that there was a major flaw in the design
for handling this, so I basically rewrote the entire thing.
Now, it does the following steps when inserting packets (a simplified
sketch follows the list):
- Insert packets into the interleaved packet array
- When both audio and video packets have been received, prune packets
  up to the point at which both audio and video start at the same time
- Resort the interleaved packet array
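
A simplified, self-contained sketch of the pruning step; these are
stand-in types, not the actual libobs internals (which also have to
handle keyframes and per-track timestamps):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct packet {
            int64_t dts_usec; /* system-time-based decode timestamp */
            bool is_video;
    };

    /* Given the interleaved array (sorted by dts_usec, timestamps
     * assumed non-negative in this sketch), return how many leading
     * packets to drop so audio and video start together */
    static size_t leading_packets_to_prune(const struct packet *pkts,
                    size_t count)
    {
            int64_t first_video = -1, first_audio = -1;

            for (size_t i = 0; i < count; i++) {
                    if (pkts[i].is_video && first_video < 0)
                            first_video = pkts[i].dts_usec;
                    else if (!pkts[i].is_video && first_audio < 0)
                            first_audio = pkts[i].dts_usec;
            }

            /* do nothing until both audio and video have arrived */
            if (first_video < 0 || first_audio < 0)
                    return 0;

            /* drop everything before the later of the two starts */
            int64_t start = first_video > first_audio ? first_video
                                                      : first_audio;
            size_t n = 0;
            while (n < count && pkts[n].dts_usec < start)
                    n++;
            return n;
    }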
I have tested this code extensively and it appears to be working well,
regardless of whether or not the encoders were already active with
another output.
Apparently I unintentionally typed received_video = false twice
instead of writing one assignment for video and one for audio.
This fixes a bug where audio would not start up again on an output that
had recently started and then stopped.
When the output sets a new audio/video encoder, it was not properly
removing itself from the previous audio/video encoders it was associated
with. It was erroneously removing itself from the encoder parameter
instead.
This fixes a minor flaw with the API where data always had to be
mutable to be usable by the API.
Functions that do not modify the fundamental underlying data of a
structure should be marked as constant, both for safety and to signify
that the parameter is input only and will not be modified by the
function using it.
Typedef pointers are unsafe. If you do:

    typedef struct bla *bla_t;

then you cannot use it as a constant, such as const bla_t, because the
const will apply to the pointer itself rather than to the underlying
data. I admit this was a fundamental mistake that must be corrected.
All typedefs that were pointer types will now have their pointers
removed from the type itself, and the pointers will be used when they
are actually used as variables/parameters/returns instead.
This does not break ABI though, which is pretty nice.
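
To see the problem concretely (a standalone example, not libobs code):

    struct bla { int x; };

    typedef struct bla *bla_ptr_t; /* old: pointer in the typedef */
    typedef struct bla bla_t;      /* new: no pointer in the typedef */

    void example(struct bla *b)
    {
            /* 'const bla_ptr_t' makes the POINTER const, not the
             * data, so this assignment compiles without complaint: */
            const bla_ptr_t p1 = b;
            p1->x = 10;

            /* with the pointer written out, const lands on the data */
            const bla_t *p2 = b;
            /* p2->x = 10;   <-- now a compile error, as intended */
            (void)p2;
    }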
This makes it easier to do two things:
1.) Get the skipped frames count relative to each specific output
2.) Make it so that getting the 'current' log will always contain
information about skipped frames. Before, you'd have to force the
user to restart the program and get the last log, which was really
annoying when you just wanted to see how the encoders were
performing.
API functions added:
-----------------------------------------------
obs_output_set_preferred_size
obs_output_get_width
obs_output_get_height
obs_encoder_set_scaled_size
obs_encoder_get_width
obs_encoder_get_height
These functions provide an easier means of setting a custom resolution
on an output or encoder.
If an output uses an encoder and you set the preferred width/height
using the output, then the output will attempt to set the scaled
width/height for the encoder it's currently using.
Outputs and encoders now should use these functions to determine the
width/height of the raw frame data instead of using the video-io
functions.
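
For example (a sketch using the functions listed above):

    #include <obs.h>
    #include <stdio.h>

    void scale_to_720p(obs_output_t *output, obs_encoder_t *encoder)
    {
            /* on an output: if the output uses an encoder, this is
             * forwarded as the encoder's scaled size */
            obs_output_set_preferred_size(output, 1280, 720);

            /* or set the scaled size on an encoder directly */
            obs_encoder_set_scaled_size(encoder, 1280, 720);

            /* query the frame size this way now, rather than through
             * the video-io functions */
            printf("output: %ux%u\n",
                   (unsigned)obs_output_get_width(output),
                   (unsigned)obs_output_get_height(output));
    }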
Instead of having functions like obs_signal_handler() that fail to
properly convey their actual intent in the name (does it signal a
handler, or does it return a signal handler?), always prefix functions
that are meant to get information with 'get' to make their
functionality more explicit.
Previous names: New names:
-----------------------------------------------------------
obs_audio obs_get_audio
obs_video obs_get_video
obs_signalhandler obs_get_signal_handler
obs_prochandler obs_get_proc_handler
obs_source_signalhandler obs_source_get_signal_handler
obs_source_prochandler obs_source_get_proc_handler
obs_output_signalhandler obs_output_get_signal_handler
obs_output_prochandler obs_output_get_proc_handler
obs_service_signalhandler obs_service_get_signal_handler
obs_service_prochandler obs_service_get_proc_handler
With the recent change to module handling by BtbN, I felt that having
this information might be useful in case someone is actually using make
install to set up their libraries.
Total bytes, total frames, and frames dropped. Total frames is
generated automatically, but total bytes and total dropped frames are
returned via callbacks.
Before, the encoder/media callbacks were assigned directly to the
output's callbacks; instead of doing that, they now go through
intermediary functions for the sake of counting the frames (see the
sketch below).
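
A sketch of the intermediary-callback idea; these are stand-in types
and names, not the actual libobs internals:

    #include <stdint.h>

    struct frame_counter {
            uint64_t total_frames;
            void (*orig_callback)(void *param);
            void *orig_param;
    };

    /* installed in place of the output's raw callback: count the
     * frame, then forward to the callback the output actually set */
    static void counting_video_callback(void *param)
    {
            struct frame_counter *counter = param;
            counter->total_frames++;
            if (counter->orig_callback)
                    counter->orig_callback(counter->orig_param);
    }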