The "clamped" video time is the system time per video frame that is
closest to the current system time, but always divisible by the frame
interval. For example, if the last frame system timestamp was 1600 and
the new frame is 2500, but the frame interval is 800, then the
"clamped" video time is 2400.
This clamped value is useful to get the relative system time without any
jitter.
Implements exponential backoff for consecutive reconnects, which is
useful to prevent too many connections from trying to reconnect back to
a service at once over a short period of time in the case of potential
service downtime. Exponential backoff causes each subsequent reconnect
attempt to double its timeout duration.
Allows the ability to hint at encoders what format should be used.
This is particularly useful if libobs is currently operating in planar
4:4:4, but you want to force an encoder used for streaming to convert to
NV12 to prevent streaming issues.
The obs_source::async_reset_texture variable can cause a data race
between threads to occur because it could be set to true in one thread
then changed back to false in another thread. This could cause the
async texture to not update its size when it's supposed to, which can
cause a crash or corruption when copying data from a frame of a
differing size.
The solution to this is to:
- Delete the async_reset_texture variable, and make the
set_async_texture_size function change the texture size if the
async_width, async_height, or async_format variables differ from the
frame's width/height/format. Those variables are then only ever set
in the libobs graphics thread.
- Make the cache_video function use separate variables from other
functions to detect a change in size (due to the fact that the texture
size should only be resized in the libobs graphics thread). These
variables are async_cache_width, async_cache_height, and
async_cache_format, which are only be set in the thread that calls
obs_source_output_video.
How to replicate the data race:
- On OSX, use window capture on a textedit window, then continually
resize the textedit window.
This fixes an issue where cache frames would not free at all after
having been allocated with no upper limit on the cached frame size. If
cached frames go unused for a specific period of time, they are
deallocated and removed from the cache.
This is preferable to having an upper cache limit due to the potential
for async delay filtering.
Async frames are only swapping when rendering, or when not visible.
This is a flawed design due to the fact that there are certain
circumstances where the source is neither visible nor currently
rendering.
This is what caused a memory leak when scene items were marked as
invisible, because if a source has an async child source and decides not
to render that source for whatever reason, the child source would not
process the async frames at all, and the cache would just grow.
To fix this, simply moving the async frame cycle to tick fixes the issue
due to the fact that tick is always called regardless of circumstance.
obs_source_process_filter tried to do everything in a single function,
but the problem is that effect parameters would not properly be
accounted for due to the way it internally draws, therefore it was
necessary to split the functions in to two, you first call
obs_source_process_filter_begin, then you set your effect parameters,
then you finally call obs_source_process_filter_end. This ensures that
when the filter is drawn, that the effect parameters are set.
For the show/hide and activate/deactivate callbacks, schedule these
callbacks to only be called from within the video thread rather than in
a separate thread. This ensures that any potential graphics activity
that occurs within them is kept in the same thread.
API changed:
--------------------------
void obs_output_set_audio_encoder(
obs_output_t *output,
obs_encoder_t *encoder);
obs_encoder_t *obs_output_get_audio_encoder(
const obs_output_t *output);
obs_encoder_t *obs_audio_encoder_create(
const char *id,
const char *name,
obs_data_t *settings);
Changed to:
--------------------------
/* 'idx' specifies the track index of the output */
void obs_output_set_audio_encoder(
obs_output_t *output,
obs_encoder_t *encoder,
size_t idx);
/* 'idx' specifies the track index of the output */
obs_encoder_t *obs_output_get_audio_encoder(
const obs_output_t *output,
size_t idx);
/* 'mixer_idx' specifies the mixer index to capture audio from */
obs_encoder_t *obs_audio_encoder_create(
const char *id,
const char *name,
obs_data_t *settings,
size_t mixer_idx);
Overview
--------------------------
This feature allows multiple audio mixers to be used at a time. This
capability was able to be added with surprisingly very little extra
overhead. Audio will not be mixed unless it's assigned to a specific
mixer, and mixers will not mix unless they have an active mix
connection.
Mostly this will be useful for being able to separate out specific audio
for recording versus streaming, but will also be useful for certain
streaming services that support multiple audio streams via RTMP.
I didn't want to use a variable amount of mixers due to the desire to
reduce heap allocations, so currently I set the limit to 4 simultaneous
mixers; this number can be increased later if needed, but honestly I
feel like it's just the right number to use.
Sources:
Sources can now specify which audio mixers their audio is mixed to; this
can be a single mixer or multiple mixers at a time. The
obs_source_set_audio_mixers function sets the audio mixer which an audio
source applies to. For example, 0xF would mean that the source applies
to all four mixers.
Audio Encoders:
Audio encoders now must specify which specific audio mixer they use when
they encode audio data.
Outputs:
Outputs that use encoders can now support multiple audio tracks at once
if they have the OBS_OUTPUT_MULTI_TRACK capability flag set. This is
mostly only useful for certain types of RTMP transmissions, though may
be useful for file formats that support multiple audio tracks as well
later on.
The temporary unoptimized code we were using before just completely
allocated a new copy of each frame every single time a new async frame
was output by the source plugin. This just creates a cache of frames as
needed for the current format/width/height to minimize the allocation
and deallocation. If new frames come in that are of a different
format/width/height, it'll just clear the cache. This is a fairly
important optimization.
all the async video related stuff usually started with async_*, and
there were two that didn't. So I just renamed them so they have the
same naming convention
If an async video source stops video for whatever reason, it would get
stuck on the last frame that was played. This was particularly awkward
when I wanted to give the user the ability to deactivate a source such
as a webcam because it would get stuck on the last frame.
Previously, the design for the interaction between the encoder thread
and the graphics thread was that the encoder thread would signal to the
graphics thread when to start drawing each frame. The original idea
behind this was to prevent mutually cascading stalls of encoding or
graphics rendering (i.e., if rendering took too long, then encoding
would have to catch up, then rendering would have to catch up again, and
so on, cascading upon each other). The ultimate goal was to prevent
encoding from impacting graphics and vise versa.
However, eventually it was realized that there were some fundamental
flaws with this design.
1. Stray frame duplication. You could not guarantee that a frame would
render on time, so sometimes frames would unintentionally be lost if
there was any sort of minor hiccup or if the thread took too long to
be scheduled I'm guessing.
2. Frame timing in the rendering thread was less accurate. The only
place where frame timing was accurate was in the encoder thread, and
the graphics thread was at the whim of thread scheduling. On higher
end computers it was typically fine, but it was just generally not
guaranteed that a frame would be rendered when it was supposed to be
rendered.
So the solution (originally proposed by r1ch and paibox) is to instead
keep the encoding and graphics threads separate as usual, but instead of
the encoder thread controlling the graphics thread, the graphics thread
now controls the encoder thread. The encoder thread keeps a limited
cache of frames, then the graphics thread copies frames in to the cache
and increments a semaphore to schedule the encoder thread to encode that
data.
In the cache, each frame has an encode counter. If the frame cache is
full (e.g., the encoder taking too long to return frames), it will not
cache a new frame, but instead will just increment the counter on the
last frame in the cache to schedule that frame to encode again, ensuring
that frames are on time and reducing CPU usage by lowering video
complexity. If the graphics thread takes too long to render a frame,
then it will add that frame with the count value set to the total amount
of frames that were missed (actual legitimately duplicated frames).
Because the cache gives many frames of breathing room for the encoder to
encode frames, this design helps improve results especially when using
encoding presets that have higher complexity and CPU usage, minimizing
the risk of needlessly skipped or duplicated frames.
I also managed to sneak in what should be a bit of an optimization to
reduce copying of frame data, though how much of an optimization it
ultimately ends up being is debatable.
So to sum it up, this commit increases accuracy of frame timing,
completely removes stray frame duplication, gives better results for
higher complexity encoding presets, and potentially optimizes the frame
pipeline a tiny bit.
In certain circumstances where the output was stopping, and where data
took a long enough time to send (such as when using an encoding preset
that causes high CPU usage), the output would sometimes still send data
even after it was stopped, typically causing the output to crash.
This changes the way source volume handles transitioning between being
active and inactive states.
The previous way that transitioning handled volume was that it set the
presentation volume of the source and all of its sub-sources to 0.0 if
the source was inactive, and 1.0 if active. Transition sources would
then also set the presentation volume for sub-sources to whatever their
transitioning volume was. However, the problem with this is that the
design didn't take in to account if the source or its sub-sources were
active anywhere else, so because of that it would break if that ever
happened, and I didn't realize that when I was designing it.
So instead, this completely overhauls the design of handling
transitioning volume. Each frame, it'll go through all sources and
check whether they're active or inactive and set the base volume
accordingly. If transitions are currently active, it will actually walk
the active source tree and check whether the source is in a
transitioning state somewhere.
- If the source is a sub-source of a transition, and it's not active
outside of the transition, then the transition will control the
volume of the source.
- If the source is a sub-source of a transition, but it's also active
outside of the transition, it'll defer to whichever is louder.
This also adds a new callback to the obs_source_info structure for
transition sources, get_transition_volume, which is called to get the
transitioning volume of a sub-source.
The reason to keep a reference counter for transitions is due to an
optimization I'm planning on when calculating transition volumes. I'm
planning on walking the source tree to be able to calculate the current
base volume of a source, but *only* if there are transitions active,
because the only time that the volume can be anything other than 1.0
or 0.0 is when there are active transitions, which may change the base
volume of a source.
Changed the design from using obs_source::enum_refs to just simply
preventing infinite source recursion in general, rather than allowing it
through the enum_refs variable. obs_source_add_child has been changed
so that it now returns a boolean, and if the function fails, it means
that the child cannot be added due to that potential recursion.
This adds bicubic and lanczos scaling capability to libobs to improve
scaling quality and sharpness when the output resolution has to be
scaled relative to the base resolution. Bilinear is also available,
although bilinear has rather poor quality and causes scaling to appear
blurry.
If the output resolution is close to the base resolution, then bilinear
is used instead as an optimization, as there's no need to use these
shaders if scaling is not in use.
The Bicubic and Lanczos effects are also exposed via exported function
to allow the ability to use those shaders in plugin modules if desired.
The API change adds a variable 'scale_type' to the obs_video_info
structure that allows the user interface to choose what type of scaling
filter should be used.
This was an important change because we were originally using an
hard-coded 709/partial range color matrix for the output, which was
causing problems for people wanting to use different formats or color
spaces. This will now automatically generate the color matrix depending
on the format, color space, and range, or use an identity matrix if the
video format is RGB instead of YUV.
This moves the 'flags' variable from the obs_source_frame structure to
the obs_source structure, and allows user flags to be set for a specific
source. Having it set on the obs_source_frame structure didn't make
much sense.
OBS_SOURCE_UNBUFFERED makes it so that the source does not buffer its
async video output in order to try to play it on time. In other words,
frames are played as soon as possible after being received.
Useful when you want a source to play back as quickly as possible
(webcams, certain types of capture devices)
This bug would happen if audio packets started being received before
video packets. It would erroneously cause audio packets to be
completely thrown away, and in certain cases would cause audio and video
to start way out of sync.
My original intention was "don't accept audio until video has started",
but instead mistakenly had the effect of "don't start audio until a
video packet has been received". This was originally was intended as a
way to handle outputs hooking in to active encoders and compensating
their existing timestamp information.
However, this made me realize that there was a major flaw in the design
for handling this, so I basically rewrote the entire thing.
Now, it does the following steps when inserting packets:
- Insert packets in to the interleaved packet array
- When both audio/video packets are received, prune packets up until the
point in which both audio/video start at the same time
- Resort the interleaved packet array
I have tested this code extensively and it appears to be working well,
regardless of whether or not the encoders were already active with
another output.
At the start of each render loop, it would get the timestamp, and then
it would then assign that timestamp to whatever frame was downloaded.
However, the frame that was downloaded was usually occurred a number of
frames ago, so it would assign the wrong timestamp value to that frame.
This fixes that issue by storing the timestamps in a circular buffer.
This Fixes a minor flaw with the API where data had to always be mutable
to be usable by the API.
Functions that do not modify the fundamental underlying data of a
structure should be marked as constant, both for safety and to signify
that the parameter is input only and will not be modified by the
function using it.
Typedef pointers are unsafe. If you do:
typedef struct bla *bla_t;
then you cannot use it as a constant, such as: const bla_t, because
that constant will be to the pointer itself rather than to the
underlying data. I admit this was a fundamental mistake that must
be corrected.
All typedefs that were pointer types will now have their pointers
removed from the type itself, and the pointers will be used when they
are actually used as variables/parameters/returns instead.
This does not break ABI though, which is pretty nice.
This prevents multiple needless calls to obs_source_get_frame and other
functions. If the texture has already been processed, then just render
it as-is in any subsequent calls to obs_source_video_render.
This is actually unnecessary now that there's a hard limit on the
maximum offset in which audio can be inserted.
This also assumes too much about the audio; it assumes audio is always
on, where as with some devices (such as the elgato) audio is not on
until the stream starts, and when the video has already incremented the
counter.
This makes it easier to do two things:
1.) Get the skipped frames count relative to each specific output
2.) Make it so that getting the 'current' log will always contain
information about skipped frames. Before, you'd have to force the
user to restart the program and get the last log, which was really
annoying when you just wanted to see how the encoders were
performing.
API functions added:
-----------------------------------------------
obs_output_set_preferred_size
obs_output_get_width
obs_output_get_height
obs_encoder_set_scaled_size
obs_encoder_get_width
obs_encoder_get_height
These functions allow for easier means of setting a custom resolution on
an output or encoder.
If an output uses an encoder and you set the preferred width/height
using the output, then the output will attempt to set the scaled
width/height for the encoder it's currently using.
Outputs and encoders now should use these functions to determine the
width/height of the raw frame data instead of using the video-io
functions.