(Note: This commit also modifies UI)
Instead of using signals, use designated callback lists for audio
capture and audio control helpers. Signals aren't suitable here because
they aren't meant for things that happen every frame or every time
audio/video is received. This also prevents memory from being allocated
every time these functions are called due to the calldata structure.
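As a minimal sketch, a consumer of the audio capture callback list might
look like this (the handler body and process_samples are hypothetical;
obs_source_add/remove_audio_capture_callback are the helpers this change
uses):

    static void on_audio(void *param, obs_source_t *source,
            const struct audio_data *audio, bool muted)
    {
        /* Called directly for every chunk of captured audio; no
         * calldata allocation is involved. */
        if (!muted)
            process_samples(audio->data, audio->frames); /* hypothetical */
        UNUSED_PARAMETER(param);
        UNUSED_PARAMETER(source);
    }

    obs_source_add_audio_capture_callback(source, on_audio, NULL);
    /* ... */
    obs_source_remove_audio_capture_callback(source, on_audio, NULL);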
Transition sources are implemented by registering a source type as
OBS_SOURCE_TYPE_TRANSITION. They're automatically marked as video
composite sources, and video_render/audio_render callbacks must be set
when registering the source. get_width and get_height callbacks are
unused for these types of sources, as transitions automatically handle
width/height behind the scenes with the transition settings.
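As a rough sketch (the id and callback names are hypothetical),
registering such a source looks like:

    struct obs_source_info my_fade_transition = {
        .id           = "my_fade",
        .type         = OBS_SOURCE_TYPE_TRANSITION,
        .get_name     = my_fade_get_name,
        .create       = my_fade_create,
        .destroy      = my_fade_destroy,
        .video_render = my_fade_video_render, /* required */
        .audio_render = my_fade_audio_render, /* required */
        /* no get_width/get_height: size comes from transition settings */
    };

    obs_register_source(&my_fade_transition);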
In the video_render callback, the helper function
obs_transition_video_render is used to assist in automatically
processing and rendering the video. A render callback is passed to the
function; the callback in turn receives the to/from textures, which are
automatically rendered in the back-end.
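A sketch of how a transition's video_render callback might use the
helper (the private data layout is hypothetical):

    static void my_fade_callback(void *data, gs_texture_t *a,
            gs_texture_t *b, float t, uint32_t cx, uint32_t cy)
    {
        /* 'a'/'b' are the from/to textures rendered by the back-end;
         * 't' is the transition progress in [0, 1]. Draw the blended
         * result here with an effect. */
    }

    static void my_fade_video_render(void *data, gs_effect_t *effect)
    {
        struct my_fade *fade = data;
        obs_transition_video_render(fade->source, my_fade_callback);
        UNUSED_PARAMETER(effect);
    }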
Similarly, in the audio_render callback, the helper function
obs_transition_audio_render is used to assist in automatically
processing and rendering the audio. Two mix callbacks are used to
handle how the source/destination sources are mixed together. To ensure
the best possible quality, audio processing is per-sample.
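For example, a crossfade might supply per-sample mix weights like this
(a sketch; the private data layout is hypothetical):

    static float mix_a(void *data, float t)
    {
        UNUSED_PARAMETER(data);
        return 1.0f - t; /* fade the "from" source out */
    }

    static float mix_b(void *data, float t)
    {
        UNUSED_PARAMETER(data);
        return t;        /* fade the "to" source in */
    }

    static bool my_fade_audio_render(void *data, uint64_t *ts_out,
            struct obs_source_audio_mix *audio, uint32_t mixers,
            size_t channels, size_t sample_rate)
    {
        struct my_fade *fade = data;
        return obs_transition_audio_render(fade->source, ts_out, audio,
                mixers, channels, sample_rate, mix_a, mix_b);
    }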
Transitions can be set to automatically resize, or they can be set to
have a fixed size. Sources within transitions can be made to scale to
the transition size (with or without aspect ratio), or to not scale
unless they're bigger than the transition. They can have a specific
alignment within the transition, or they just default to top-left.
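Assuming a transition handle (obs_source_t *transition), the sizing
behavior can be configured roughly like so:

    /* Give the transition an explicit size rather than auto-sizing. */
    obs_transition_set_size(transition, 1280, 720);

    /* Scale the from/to sources to fit, preserving aspect ratio
     * (OBS_TRANSITION_SCALE_STRETCH and OBS_TRANSITION_SCALE_MAX_ONLY
     * are the other modes). */
    obs_transition_set_scale_type(transition, OBS_TRANSITION_SCALE_ASPECT);

    /* Anchor the sources at the top-left of the transition. */
    obs_transition_set_alignment(transition, OBS_ALIGN_TOP | OBS_ALIGN_LEFT);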
These features are implemented for the purpose of extending transitions
to also act as "switch" sources later, where you can switch to/from two
different sources using the transition animation.
Planned (but not yet implemented and lower priority) features:
- "Switch" transitions which allow the ability to switch back and forth
between two sources with a transitioning animation without discarding
the references
- Easing options to allow the option to transition with a bezier or
custom curve
- Manual transitioning to allow the front-end/user to manually control
the transition offset
Prevents a deadlock between the scene mutexes and the graphics mutex.
In libobs/obs-video.c, the graphics mutex could be locked first and the
scene mutexes second, while in the UI thread the scene mutexes could be
locked first; then, when destroying a scene item caused a source to be
destroyed, that source would sometimes lock the graphics mutex second.
A possible additional solution is to defer source destroys to the video
thread.
(Note: test and UI are also modified by this commit)
API Changed (removed "enum obs_source_type type" parameter):
-------------------------
obs_source_get_display_name
obs_source_create
obs_get_source_output_flags
obs_get_source_defaults
obs_get_source_properties
Removes the "type" parameter from these functions. The "type" parameter
really doesn't serve much of a purpose being a parameter in any of these
cases, the type is just to indicate what it's used for.
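For example (using the real "ffmpeg_source" id; the surrounding code is
illustrative):

    obs_data_t *settings = obs_data_create();

    /* Before: obs_source_create(OBS_SOURCE_TYPE_INPUT, "ffmpeg_source",
     *                           "my clip", settings, NULL); */

    /* After: the type is implied by the registered source id. */
    obs_source_t *src = obs_source_create("ffmpeg_source", "my clip",
            settings, NULL);

    obs_data_release(settings);
    obs_source_release(src);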
This buffers scene item visibility actions so that when
obs_sceneitem_set_visible is called to show or hide an item, the action
is mapped to the exact audio sample point at the time of the call.
Mapping to the exact sample point isn't strictly necessary, but it's a
nice thing to have.
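Usage is unchanged; the buffering happens behind the call:

    /* Hide the item; the action is stamped so the audible change lands
     * on the sample point corresponding to this call. */
    obs_sceneitem_set_visible(item, false);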
The new audio subsystem fixes two issues:
- The primary issue it fixes is the ability for parent sources to
intercept the audio of child sources and do custom processing on them.
The main motivation was custom cross-fading in transitions, but it's
also useful for things such as side-chain effects, applying audio
effects to entire scenes, applying scene-specific audio filters on
sub-sources, and other such possibilities.
- The secondary issue that needed fixing was audio buffering.
Previously, audio buffering was always a fixed buffer size, so it
would always have exactly a certain number of milliseconds of audio
buffering (and thus output delay). Instead, it now dynamically
increases audio buffering only as necessary, minimizing output delay,
and removing the need for users to have to worry about an audio
buffering setting.
The new design makes it so that audio from the leaves of the scene graph
flow to the root nodes, and can be intercepted by parent sources. Each
audio source handles its own buffering, and each audio tick a specific
number of audio frames are popped from the front of the circular buffer
on each audio source. Composite sources (such as scenes) can access the
audio for child sources and do custom processing or mixing on that
audio. Composite sources use the audio_render callback of sources to do
synchronous or deferred audio processing per audio tick. Things like
scenes now mix audio from their sub-sources.
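A sketch of what a composite source's audio_render might look like (the
child list here is hypothetical; a real scene iterates its scene
items):

    struct my_composite {
        obs_source_t **children;
        size_t num_children;
    };

    static bool my_composite_audio_render(void *data, uint64_t *ts_out,
            struct obs_source_audio_mix *audio_output, uint32_t mixers,
            size_t channels, size_t sample_rate)
    {
        struct my_composite *ctx = data;
        uint64_t timestamp = 0;

        for (size_t i = 0; i < ctx->num_children; i++) {
            obs_source_t *child = ctx->children[i];
            struct obs_source_audio_mix child_audio;

            /* Skip children still buffering audio this tick. */
            if (obs_source_audio_pending(child))
                continue;

            uint64_t ts = obs_source_get_audio_timestamp(child);
            if (!ts)
                continue;
            if (!timestamp || ts < timestamp)
                timestamp = ts;

            /* Pull the child's audio for this tick and sum it in. */
            obs_source_get_audio_mix(child, &child_audio);

            for (size_t mix = 0; mix < MAX_AUDIO_MIXES; mix++) {
                if (!((mixers >> mix) & 1))
                    continue;
                for (size_t ch = 0; ch < channels; ch++) {
                    float *out = audio_output->output[mix].data[ch];
                    const float *in = child_audio.output[mix].data[ch];
                    for (size_t s = 0; s < AUDIO_OUTPUT_FRAMES; s++)
                        out[s] += in[s];
                }
            }
        }

        *ts_out = timestamp;
        UNUSED_PARAMETER(sample_rate);
        return true;
    }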
(Note: This commit breaks libobs compilation. Skip if bisecting)
Adds a "composite" source type which is used for sources that composite
one or more sub-sources. The audio_render callback is called for
composite sources to allow those types of sources to do custom
processing of the audio of its sub-sources.
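Registration-wise (names hypothetical), a composite source declares the
flag and supplies audio_render:

    struct obs_source_info my_composite_source = {
        .id           = "my_composite",
        .type         = OBS_SOURCE_TYPE_INPUT,
        .output_flags = OBS_SOURCE_VIDEO | OBS_SOURCE_COMPOSITE,
        .get_name     = my_composite_get_name,
        .create       = my_composite_create,
        .destroy      = my_composite_destroy,
        .video_render = my_composite_video_render,
        .audio_render = my_composite_audio_render, /* mixes sub-sources */
    };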
(Note: This commit breaks libobs compilation. Skip if bisecting)
This variable is somewhat redundant. Volume is already known/accessible
to front-ends.
(Note: This commit breaks libobs compilation. Skip if bisecting)
Removes audio lines and stores the circular buffer for the audio on the
source itself.
(Note: This commit breaks libobs compilation. Skip if bisecting)
The mixers that a source was assigned to were originally stored in the
audio line. This will store it in the sources themselves instead.
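With the move, the mixer assignment lives on the source and is set
through the source API, roughly:

    /* Assign the source to mixer tracks 1 and 2 (a bitmask). */
    obs_source_set_audio_mixers(source, (1 << 0) | (1 << 1));
    uint32_t mixers = obs_source_get_audio_mixers(source);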
(Note: This commit breaks libobs compilation. Skip if bisecting)
Uses a callback and allows the caller to mix audio. Additionally, the
callback is allowed to return audio later, so it can buffer as much as
it needs.
The skipped frame count (dropped frames due to encoding being
overloaded) would erroneously include lagged frames (dropped frames due
to render stalls). This will make diagnosing actual issues a user might
be having a bit easier.
Problem:
When an output is started with encoders that have already been started
by another output, and it starts within the window between when the
first audio packets come in and when the first video packet comes in,
the output would get audio packets with timestamps potentially much
later than the first video frame once that frame finally arrives. The
audio/video encoders will almost always have differing delays.
Solution:
Detect that starting window, and if within that starting window, wait
for a new keyframe from video instead of trying to sync up video.
Additional Notes:
In these cases where an output starts with already-active encoders, this
patch also reduces the potential sync offset between the first video
frame and the first audio frame. Before, it would sync the first video
frame with the first audio frames right after that, but now it syncs
with the closest audio frame in the interleaved array, which halves the
potential sync difference between the first video frame and the first
audio frame of already-active encoders. (So now the potential sync
difference is roughly 11.6 milliseconds at 44.1kHz audio, where it was
23.2 before; with 1024-sample audio frames, one frame is
1024 / 44100 ≈ 23.2 ms, and syncing to the nearest frame halves the
worst case.)
Ensures that the packet dts_usec values generated for
syncing/interleaving use the proper offset relative to where they're
supposed to start from. Previously, the negative DTS of a first video
packet could potentially be applied twice.
Allows the ability to use fixed stack memory to construct a calldata_t
structure rather than having to allocate each time. This is fine to do
for certain signals where the calldata never goes above a specific size.
If by some chance the size is insufficient, it will output a log
message.
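A sketch of the fixed-stack usage (the signal name is hypothetical):

    uint8_t stack[128];
    calldata_t cd;

    /* Back the calldata with stack memory; nothing is heap-allocated.
     * If 128 bytes ever proves insufficient, a log message is output. */
    calldata_init_fixed(&cd, stack, sizeof(stack));
    calldata_set_ptr(&cd, "source", source);
    signal_handler_signal(handler, "my_signal", &cd);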
Originally this was programmed to call the recursive height/width
functions only when the source type was an input, with the intention of
not calling them on filters. Instead of doing it that way, program it
to do exactly that: only call the recursive height/width functions if
the source is not a filter.
This was originally used for calculating audio volume if transitions
were active, but transitions won't work that way so tracking the active
transitions is no longer needed.