Originally when the audio_submix function was created, it used all mixer
tracks, but at a certain point that was removed because it can only use
the first track, so some older code was unintentionally left over,
causing the same code to be executed 6 times mistakenly. This cleans
that up by removing the unnecessary function parameter and for loop.
When an unpause occurs, it takes an audio segment and splits it at the
exact point corresponding to the pause timestamp, and then it's supposed
to only send the ending part of the split. However, the audio pointers
were not being incremented, therefore it was sending the front of the
audio segment to instead of the back of the audio segment by mistake.
When pause has been activated, the video_pause_check() function is used
when receiving raw frames in order to filter out frames that are in the
pause window, that way they aren't sent to the encoder or output.
However, when pause was enabled, it was unintentionally filtering out
some frames before the specified starting timestamp as well, causing
extra video data to get cut out prematurely. This fixes that issue.
When a pause/unpause occurs, a timestamp is set and the actual
pause/unpause does not occur until the output/encoders reach the
specified timestamps. Do not allow pausing/unpausing unless that point
has been reached with all encoders of an encoded output or the output
itself when using a raw output.
This fixes a bug where pause data could get corrupted if
pausing/unpausing too fast, because the audio/video encoders aren't
necessarily synchronized and although one encoder may have unpaused, the
other encoder(s) may not have yet. Checking all encoders first before
allowing a pause/unpause ensures that doesn't occur.
Audio latency can get really low, and if it's low enough, the timestamp
can be passed by the audio subsystem before it's had a chance to pause
with it. So instead, make the pause have a little bit of extra delay to
ensure that doesn't occur.
This fixes a race condition where the audio/video backends/threads may
start using sources before their obs_source_info::create function has
been called.
Also increase weight precision by premultiplying UV in VS.
Intel HD Graphics 530, Intel GPA, SetStablePowerState
256x224 -> 1323x1080: 1221 us -> 1020 us
Adds the "audio_line" internal source type as a bare source type for the
sole purpose of outputting audio, and the obs_source_info::audio_mix
callback which allows mixing of those audio lines, which is then treated
as normal audio for the source. Audio line objects should be added as
sub-sources when multiple audio lines from a single source are needed,
then mixed together with the audio_mix callback.
The difference between the new obs_source_info::audio_mix callback and
obs_source_info::audio_render is that obs_source_info::audio_mix (along
with the audio_line source) are only one track, and it outputs audio to
the source automatically via obs_source_output_audio() when the call
completes. This allows the mixed audio to be treated like a normal
source's audio, in that you can filter it, change its volume, or monitor
it.
This change was necessary because the CEF (used with the browser source)
outputs multiple audio streams at once to a single browser source, so
it's the program's responsibility to mix those streams together itself.
Fixes an issue where the browser source settings will continually reset
pre-24. Note that this is not 23.2.2, but the version is being
temporarily updated in order to fix the issue for the release candidate
build.
When texel samples are not exactly on texel centers, weight calculations
will involve a divide by a number very close to zero, resulting in
precision issues. Restore normalization of weights to compensate.
'obs_properties_apply_settings' and 'obs_properties_remove_by_name'
would incorrectly ignore property groups and not call the callbacks
or remove the property, resulting in quite glitchy UI.
This fix works by splitting the internal logic of ...apply_settings
into an extra function to call, while keeping the API the same. The
change to ...remove_by_name is to simply recursively going into the
group content with the same function call.
The shaders to unpack YUV information from the same texture were rather
complicated. Breaking them up into separate textures makes the shaders
much simpler, and we can remove the PRECISION_OFFSET hack.
Performance also gets a nice boost on Intel for planar textures.
Intel GPA, SetStablePowerState, Intel HD Graphics 530, 1920x1080
UYVY: 473 us -> 457 us
YUY2: 492 us -> 422 us
YVYU: 491 us -> 441 us
I420: 1637 us -> 505 us
I422: 1644 us -> 482 us
I444: 1653 us -> 504 us
NV12: 1656 us -> 369 us
Y800 (limited): 270 us -> 277 us
Y800 (full): 263 us -> 289 us
RGB (limited): 341 us -> 411 us
BGR3 (limited): 512 us -> 509 us
BGR3 (full): 527 us -> 534 us
The video format is not updated if switching between cache-compatible
formats, e.g. YUY2 and YVYU, resulting in the wrong conversion technique
being used. This change ensures the format is always up-to-date.
This change only wraps the functionality. I have rough code to exercise
the the query functionality, but that part is not really clean enough to
submit.
The shaders to pack YUV information into the same texture were rather
complicated and suffering precision issues. Breaking them up into
separate textures makes the shaders much simpler and avoids having to
compute large integer offsets. Unfortunately, the code to handle
multiple textures is not as pleasant, but at least the NV12 rendering
path is no longer separate.
In addition, write chroma samples to "standard" offsets. For I444,
there's no difference, but I420/NV12 formats now have chroma shifted to
the left as 4:2:0 is shown in the H.264 specification.
Intel GPA, SetStablePowerState, Intel HD Graphics 530
Expect speed incrase:
I420: 844 us -> 493 us (254 us + 190 us + 274 us)
I444: 837 us -> 747 us (258 us + 276 us + 272 us)
NV12: 450 us -> 368 us (319 us + 168 us)
Expect no change:
NV12 (HW): 580 (481 us + 166 us) us -> 588 us (468 us + 247 us)
RGB: 359 us -> 387 us
Fixes https://obsproject.com/mantis/view.php?id=624
Fixes https://obsproject.com/mantis/view.php?id=1512
Use bilinear filtering to reduce 36 taps to 25 for the regular path.
This works because the middle weights are always between 0 and 1,
allowing texture coordinates to be placed strategically to sample
correct ratios. I'm not sure about the undistort path, so I've left that
alone.
Also remove scaling added in #526, after which weight normalization is
unnecessary. If we want to use or invent an algorithm with alternate
downscaling properties, that's fine, but I don't think we should change
Lanczos scaling to mean something it's not. The scale implementation was
also seen not working when applied directly to scene items because of
assumptions made about the projection matrix.
Intel GPA, SetStablePowerState, Intel HD Graphics 530, D3D11
644x478 -> 1323x1080: 3890 us -> 3401 us
1920x1080 -> 1280x720: 2555 us -> 2261 us