1812 Commits

Author SHA1 Message Date
jpark37
b0e7c5d0d3 libobs: Fix stale format in async frame cache
The video format is not updated if switching between cache-compatible
formats, e.g. YUY2 and YVYU, resulting in the wrong conversion technique
being used. This change ensures the format is always up-to-date.
2019-08-07 06:47:27 -07:00
jp9000
68a5a40df9 libobs, obs-ffmpeg, win-dshow: Fix FFmpeg 4.0 deprecation
Fixes FFmpeg 4.0 deprecation warnings.
2019-07-29 20:34:13 -07:00
Jim
62c7e00d16
Merge pull request #1993 from jpark37/faster-bicubic
Optimize bicubic shader
2019-07-26 00:36:19 -07:00
jp9000
aee84cc743 libobs: Add "monitoring by default" source cap
(This also modifies the UI module)

Adds the ability for a source to monitor by default.  This is mainly
aimed at browser sources, so that they do not stop outputting audio by
default like they used to.
2019-07-26 00:05:14 -07:00
jpark37
2721ac4a85 libobs: Optimize bicubic shader
Use bilinear filtering to reduce 16 taps to 9 for the regular path. This
works because the middle weights are always between 0 and 1, allowing
texture coordinates to be placed strategically to sample correct ratios.
I'm not sure about the undistort path, so I've left that alone.

Also remove weight normalization. I'm not seeing that make even a small
difference.

Intel HD Graphics 530, D3D11
644x478 -> 1323x1080: 1790 us -> 1279 us
1920x1080 -> 1280x720: 1301 us -> 918 us

References:
https://entropymine.com/imageworsener/bicubic/
http://vec3.ca/bicubic-filtering-in-fewer-taps/
http://developer.download.nvidia.com/books/HTML/gpugems/gpugems_ch24.html
2019-07-25 22:21:11 -07:00
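
Note on "libobs: Optimize bicubic shader" above: the tap reduction relies on a general property of bilinear filtering, namely that one fetch placed between two neighbouring texels returns their weighted blend. Merging the middle pair on each axis turns 4 taps into 3, and 3 x 3 = 9. The 1D sketch below illustrates that property only; the function names are hypothetical and this is not the shader code from the commit.

```python
def merge_taps(w0, w1):
    """Collapse two neighbouring taps into one bilinear fetch.

    Returns (offset, weight) such that
        weight * fetch(p + offset) == w0 * texel[p] + w1 * texel[p + 1].
    The offset only lands between the two texels when both weights are
    non-negative, which is why this suits the middle pair of weights.
    """
    total = w0 + w1
    offset = w1 / total
    return offset, total


def bilinear_fetch(texels, x):
    """Software stand-in for a hardware linear filter (1D, x in texel units)."""
    i = int(x)
    f = x - i
    if f == 0.0:
        return texels[i]
    return texels[i] * (1.0 - f) + texels[i + 1] * f


# Hypothetical data: blend texels 1 and 2 with weights 0.6 and 0.3.
texels = [10.0, 20.0, 50.0, 80.0]
offset, weight = merge_taps(0.6, 0.3)
two_taps = 0.6 * texels[1] + 0.3 * texels[2]
one_tap = weight * bilinear_fetch(texels, 1 + offset)
```

Here `one_tap` and `two_taps` agree up to floating-point rounding, so the single fetch replaces two.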
jpark37
2f286b81d9 libobs: Default sampler sometimes unset for GL
When mixing sampling with raw loads in a shader, ending a shader with a
load would cause the default sampler to become unset for OpenGL. Instead,
initialize with no sampler, and only set if there is a sampler.
2019-07-25 21:54:23 -07:00
jpark37
2c6aac32c8 libobs: Fix benign typo 2019-07-25 21:54:23 -07:00
James Park
37f663a789 libobs: obs-ffmpeg: win-dshow: Planar 4:2:2 video
This format has been seen when using FFmpeg MJPEG decompression.
2019-07-25 20:11:37 -07:00
Jim
173eec0d6e
Merge pull request #1978 from jpark37/defer-yuv-multiply
libobs: Rework RGB to YUV conversion
2019-07-25 16:20:48 -07:00
Jim
37072e6c97
Merge pull request #1932 from Chiitoo/libdir
cmake: Install 'libobs.pc' under the correct 'libdir'
2019-07-22 02:45:22 -07:00
jpark37
2656bf0a90 libobs: Rework RGB to YUV conversion
RGB to YUV conversion was previously baked into every scale shader, but
this work has been moved to the YUV packing shaders. The scale shaders
now write RGBA instead. In the case where base and output resolutions
are identical, the render texture is forwarded directly to the YUV pack
step, skipping an entire fullscreen pass.

Intel GPA, SetStablePowerState, Intel HD Graphics 530, NV12

1920x1080, Before:
RGBA -> UYVX: ~321 us
UYVX -> Y: ~480 us
UYVX -> UV: ~127 us

1920x1080, After:
[forward render texture]
RGBA -> Y: ~487 us
RGBA -> UV: ~131 us

1920x1080 -> 1280x720, Before:
RGBA -> UYVX: ~268 us
UYVX -> Y: ~209 us
UYVX -> UV: ~57 us

1920x1080 -> 1280x720, After:
RGBA -> RGBA (rescale): ~268 us
RGBA -> Y: ~210 us
RGBA -> UV: ~58 us
2019-07-22 01:12:35 -07:00
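
Note on "libobs: Rework RGB to YUV conversion" above: summing the per-pass timings quoted in the message makes the effect easier to see. The figures are the approximate ("~") values from the commit, so the totals are approximate as well; the dictionary keys are labels made up for this sketch.

```python
# Per-pass GPU timings in microseconds, copied from the commit message.
timings_us = {
    "1080p before": [321, 480, 127],        # RGBA->UYVX, UYVX->Y, UYVX->UV
    "1080p after": [487, 131],              # RGBA->Y, RGBA->UV
    "1080p->720p before": [268, 209, 57],   # RGBA->UYVX, UYVX->Y, UYVX->UV
    "1080p->720p after": [268, 210, 58],    # rescale, RGBA->Y, RGBA->UV
}

totals = {name: sum(passes) for name, passes in timings_us.items()}

saved_same_res = totals["1080p before"] - totals["1080p after"]
saved_rescaled = totals["1080p->720p before"] - totals["1080p->720p after"]
```

On these numbers the saving comes from the identical-resolution case, where a whole pass is skipped; the rescaled case is roughly unchanged.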
jpark37
e5b004fd48 libobs: Remove YUV transformation on CPU
This code path does not appear to be used. Breakpoint-inspected all four
output formats I420/I444/NV12/RGB, and they are all behaving as they
should.
2019-07-22 01:12:01 -07:00
jpark37
3ea98b8b0d libobs: Improve timing of unbuffered deinterlacing
There are devices like the GV-USB2 that produce frames with smooth
timestamps at an uneven pace, which causes OBS to stutter because the
unbuffered path is designed to aggressively operate on the latest frame.

We can make the unbuffered path work by making two adjustments:

- Don't discard the current frame until it has elapsed.
- Don't skip frames in the queue until they have elapsed.

The buffered path still has problems with deinterlacing GV-USB2 output,
but the unbuffered path is better anyway.

Testing:

GV-USB2, Unbuffered: Stuttering is gone!
GV-USB2, Buffered: No regression (still broken).
SC-512N1-L/DVI, Unbuffered: No regression (still works).
SC-512N1-L/DVI, Buffered: No regression (still works).
2019-07-20 21:00:59 -07:00
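
Note on "libobs: Improve timing of unbuffered deinterlacing" above: the two adjustments can be read as a frame-selection rule. The sketch below is one possible interpretation, not the libobs code. It assumes a frame has "elapsed" once `now >= timestamp + interval`, and all names (`pick_frame`, `current`, `queue`) are hypothetical.

```python
from collections import deque


def pick_frame(current, queue, now, interval):
    """Choose the timestamp of the frame to show at time `now`.

    current: timestamp of the frame being shown, or None
    queue:   deque of pending frame timestamps, oldest first
    """
    # Adjustment 1: keep the current frame until it has elapsed.
    if current is not None and now < current + interval:
        return current
    # Adjustment 2: skip queued frames only once they have elapsed,
    # always leaving at least one frame to show.
    while len(queue) > 1 and now >= queue[0] + interval:
        queue.popleft()
    return queue.popleft() if queue else current


# Hypothetical timeline, arbitrary time units, interval of 33.
q = deque([33, 66])
shown_early = pick_frame(0, q, now=20, interval=33)   # frame 0 has not elapsed
shown_next = pick_frame(0, q, now=40, interval=33)    # frame 0 elapsed, 33 has not
```

Under this reading, a burst of frames arriving together is played out in order instead of jumping straight to the newest one.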
Jim
262a8c62bc
Merge pull request #1981 from jpark37/optimize-backdrop
libobs: UI: Remove DrawBackdrop() to save fullscreen pass
2019-07-20 17:09:08 -07:00
Jim
3f7d4fe1f5
Merge pull request #1975 from jpark37/area-upscale-shader
libobs: obs-filters: Area upscale shader
2019-07-20 17:06:41 -07:00
Jim
ffcfe4c9d9
Merge pull request #1951 from jpark37/audio-buffering
Fix audio buffering for devices like GV-USB2
2019-07-20 17:05:27 -07:00
jpark37
3456ed0644 libobs: UI: Remove DrawBackdrop() to save fullscreen pass
It's a waste of GPU time to do two fullscreen passes to render final mix
previews. Use blend states to simulate the black background of
DrawBackdrop() for the following situations:

- Main preview window (Studio Mode off)
- Studio Mode: Program

This does not affect:

- Studio Mode: Preview (still uses DrawBackdrop)
- Fullscreen Projector (uses GPU clear to black)
- Windowed Projector (uses GPU clear to black)

Intel GPA, SetStablePowerState, Intel HD Graphics 530, 1920x1080

Before:
DrawBackdrop: ~529 us
main texture: ~367 us (Cheaper than drawing a black quad?)

After:
[DrawBackdrop optimized away]
main texture: ~383 us
2019-07-18 19:58:29 -07:00
jpark37
85cc7c84bc libobs: obs-filters: Area upscale shader
Add a separate shader for area upscaling to take advantage of bilinear
filtering. Iterating over texels is unnecessary in the upscale case
because a target pixel can only overlap 1 or 2 texels in X and Y
directions. When only overlapping one texel, adjust UVs to sample texel
center to avoid filtering.

Also add "base_dimension" uniform to avoid unnecessary division.

Intel HD Graphics 530, 644x478 -> 1323x1080: ~836 us -> ~232 us
2019-07-17 21:11:18 -07:00
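
Note on "libobs: obs-filters: Area upscale shader" above: the claim that a target pixel overlaps only 1 or 2 texels per axis when upscaling can be checked in 1D. The sketch below uses exact fractions and also derives a sample position whose bilinear weights match the area coverage. It is an illustration under those assumptions, with a hypothetical function name, not the shader itself.

```python
from fractions import Fraction
from math import ceil, floor


def upscale_sample(i, src, dst):
    """1D area upscale for target pixel i, with src <= dst.

    Returns (texels_overlapped, sample_x), where sample_x is in source
    texel units and texel k has its center at k + 1/2.
    """
    lo = Fraction(i * src, dst)         # left edge of the pixel footprint
    hi = Fraction((i + 1) * src, dst)   # right edge; hi - lo <= 1 when upscaling
    first = floor(lo)
    last = ceil(hi) - 1
    half = Fraction(1, 2)
    if first == last:
        # One texel: sample its center so no filtering takes place.
        return 1, first + half
    # Two texels: shift toward the second by its share of the footprint,
    # so the bilinear weights equal the area coverage.
    w_last = (hi - (first + 1)) / (hi - lo)
    return 2, first + half + w_last


# Resolution pair taken from the commit's benchmark line (width only).
overlaps = {upscale_sample(i, 644, 1323)[0] for i in range(1323)}
```

For that resolution pair, `overlaps` contains only the values 1 and 2, which is what makes a loop over texels unnecessary.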
jp9000
d4e236dd03 libobs: Fix formatting 2019-07-13 19:01:48 -07:00
Colin Edwards
c64d82530d
Merge pull request #1960 from Xaymar/patch-get_defaults2
libobs: Call both get_defaults and get_defaults2
2019-07-13 20:40:20 -05:00
Colin Edwards
4ec072075d
Merge pull request #1958 from DDRBoxman/format
Apply clang-format to objective c code
2019-07-12 21:43:55 -05:00
Michael Fabian 'Xaymar' Dirks
3f6bbe2d49 libobs: Call both get_defaults and get_defaults2
Unlike get_properties, there is no reason not to call get_defaults if it is
given in addition to get_defaults2. Additionally, this fixes the bug with
'init_encoder' which would only ever call get_defaults, resulting in broken
encoders if those used get_defaults2.
2019-07-13 00:49:18 +02:00
jp9000
bda28b242c libobs: Fix formatting 2019-07-12 11:48:41 -07:00
Jim
36c8090492
Merge pull request #1952 from obsproject/pause
Add the ability to pause and unpause recordings
2019-07-11 19:56:41 -07:00
Colin Edwards
ad85a9fa25 Apply clang-format to objective c code 2019-07-09 13:39:13 -05:00
wang-bin
5b6ee6e66b libobs: Clear module variable in case module reloaded
Closes obsproject/obs-studio#1957
2019-07-09 08:37:43 -07:00
jp9000
153fa6337f libobs: Implement pausing of outputs
This implements pausing of outputs.  To accomplish this, raw audio/video
data is halted to the encoders or raw output.  Pausing is as precisely
timed as possible according to the timing of the obs_output_pause call,
and audio data will be spliced down to the exact audio sample in
accordance with that timing at the start/end marks.

Outputs that support this (outputs used for recording) can set the
OBS_OUTPUT_CAN_PAUSE capability flag.
2019-07-07 16:38:22 -07:00
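
Note on "libobs: Implement pausing of outputs" above: splicing audio "down to the exact audio sample" amounts to converting a timestamp difference into a sample count. The sketch below shows that arithmetic only. The function names, the nanosecond timestamps, and the truncating division are assumptions made for illustration and may differ from what libobs does.

```python
NS_PER_SEC = 1_000_000_000


def samples_before(packet_ts_ns, frames, sample_rate, mark_ts_ns):
    """Number of audio frames in a packet that lie before mark_ts_ns."""
    if mark_ts_ns <= packet_ts_ns:
        return 0
    n = (mark_ts_ns - packet_ts_ns) * sample_rate // NS_PER_SEC
    return min(n, frames)


def splice_at_pause(samples, packet_ts_ns, sample_rate, pause_ts_ns):
    """Keep only the samples that precede the pause mark."""
    keep = samples_before(packet_ts_ns, len(samples), sample_rate, pause_ts_ns)
    return samples[:keep]


# Hypothetical packet: 1024 samples at 48 kHz starting at t = 0,
# with a pause requested 10 ms in.
packet = list(range(1024))
kept = splice_at_pause(packet, 0, 48000, 10_000_000)
```

The same count, applied from the other side, would give the samples to drop at the unpause mark.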
jp9000
85ca1b6918 libobs: Correct raw output starting audio data
If the audio subsystem was buffered to any extent, the audio of a raw
output would start off at a negative offset, requiring each raw output
to implement a "prepare_audio" function (as seen in the FFmpeg output)
in order to ensure proper synchronization with video.  This did not
apply to encoded outputs because it was already being performed by the
obs-encoder code.
2019-07-07 16:38:21 -07:00
jp9000
70ecbcd5d4 libobs: Add obs_get_frame_interval_ns
Returns the current video frame interval between frames, in nanoseconds.
2019-07-07 16:38:21 -07:00
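
Note on "libobs: Add obs_get_frame_interval_ns" above: for a frame rate expressed as a numerator and denominator, the interval in nanoseconds is the reciprocal of the rate scaled by 10^9. A minimal sketch of that arithmetic, assuming integer truncation; the Python function is a stand-in, and the rounding used by libobs is not stated in the commit.

```python
def frame_interval_ns(fps_num, fps_den):
    """Frame interval in nanoseconds for a rate of fps_num / fps_den."""
    return 1_000_000_000 * fps_den // fps_num


interval_60 = frame_interval_ns(60, 1)
interval_ntsc = frame_interval_ns(30000, 1001)
```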
James Park
da87f08da4 libobs: Buffer-smoothing enhancements
If an audio source does not provide enough data at a steady pace, the
timestamp update does not happen, and buffering increases until it
maxes out. To counteract this, update the timestamp anyway.

Another issue for decoupled audio sources is that timing is not
adjusted for divergence from system time. Making this adjustment is
better for timing stability.

5+ hours of stable audio without any buffering on my GV-USB2 where it
used to add 21ms every 5 minutes or so.

Fixes https://obsproject.com/mantis/view.php?id=1269
2019-07-05 09:13:18 -07:00
jpark37
2ef25ceb85 libobs: Fix format selection
Fix ternary test to use BGRX render targets for YUV to RGB
conversions. The previous behavior may have been fine though since
the shaders fill the alpha channel with 1.0 anyway.
2019-06-27 08:57:41 -05:00
jp9000
f53df7da64 clang-format: Apply formatting
Code submissions have continually suffered from formatting
inconsistencies that constantly have to be addressed.  Using
clang-format simplifies this by making code formatting more consistent,
and allows automation of the code formatting so that maintainers can
focus more on the code itself instead of code formatting.
2019-06-23 23:49:10 -07:00
jp9000
53615ee10f clang-format: Add clang-format files 2019-06-23 01:53:56 -07:00
Jimi Huotari
ab67b39257
cmake: Install 'libobs.pc' under the correct 'libdir'
In 'libobs/CMakeLists.txt', use '${CMAKE_INSTALL_LIBDIR}' instead of
'${CMAKE_INSTALL_PREFIX}/lib', as the latter results in 'libobs.pc'
being installed under '/lib' when '/lib64' would be more appropriate.

In 'libobs/libobs.pc.in', use '@CMAKE_INSTALL_FULL_LIBDIR@' for
'libdir', '@CMAKE_INSTALL_FULL_INCLUDEDIR@' for 'includedir',
and '@CMAKE_INSTALL_PREFIX@' for 'prefix'.

Gentoo-Bug: https://bugs.gentoo.org/644538
2019-06-21 21:27:53 +03:00
James Park
aa22b61e3e libobs: Full-screen triangle format conversions
The cache coherency of rasterization for full-screen passes is better
using an oversized triangle that is clipped rather than two triangles.
Traversal order of rasterization is GPU-specific, but will almost
certainly be better using an undivided primitive.

A smaller benefit is that quads along the diagonal are not evaluated
multiple times, but that's minor in comparison.

Redo format shaders to bypass vertex buffer, and input layout. Add
global shader bool "obs_glsl_compile" to make API-specific decisions,
i.e. handle upside-down UVs. gl_ortho is not needed for format
conversion because the vertex shader does not use ViewProj anymore.

This can be applied to more situations, but start small first.

Testbed full screen passes, Intel HD Graphics 530:
RGBA -> UYVX: 467 -> 439 us, ~6% savings
UYVX -> UV: 295 -> 239 us, ~19% savings
2019-06-18 22:29:07 -07:00
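
Note on "libobs: Full-screen triangle format conversions" above: bypassing the vertex buffer typically means deriving clip-space positions from the vertex index alone. The sketch below shows one common way to do that and checks that the oversized triangle covers the whole [-1, 1] viewport square. The index-to-position mapping is an assumption chosen for illustration and may not match the shaders in the commit.

```python
def fullscreen_triangle_vertex(vertex_id):
    """Clip-space position for vertex 0, 1 or 2 of an oversized triangle."""
    x = float(vertex_id & 1) * 4.0 - 1.0
    y = float(vertex_id >> 1) * 4.0 - 1.0
    return x, y


def covers(tri, p):
    """True if point p is inside or on the edge of a counter-clockwise triangle."""
    def edge(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    a, b, c = tri
    return edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0


triangle = [fullscreen_triangle_vertex(i) for i in range(3)]
viewport_corners = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]
all_covered = all(covers(triangle, corner) for corner in viewport_corners)
```

Because the single triangle contains every corner of the viewport, clipping leaves exactly the full-screen area, with no diagonal seam between two primitives.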
Jim
6a795d52ea
Merge pull request #1894 from Rosuav/lock-unlock-event
libobs/UI: Implement an item_locked event
2019-06-18 20:31:26 -07:00
Jim
ab70bff4b3
Merge pull request #1913 from jpark37/area-shader-optimization
libobs: Area-resampling shader optimizations
2019-06-17 20:40:25 -07:00
Jim
fafda14963
Merge pull request #1906 from jpark37/bgr-three
libobs: linux-v4l2: obs-ffmpeg: Add packed BGR3 video support
2019-06-15 16:40:44 -07:00
Jim
914fffe137
Merge pull request #1889 from jpark37/warm-render-targets
libobs: Remove unnecessary frame pipelining
2019-06-15 16:13:03 -07:00
Chris Angelico
2fe641b8a4 libobs, UI: Implement item_locked event
Similar to item_visible, this event fires whenever a scene item is
locked or unlocked. This allows the UI and libobs to remain in sync
regarding scene elements' statuses.
2019-06-15 16:09:10 -07:00
Jim
dd607b422f
Merge pull request #1881 from jpark37/lowres-fair-sampling
libobs: Improve low-resolution bilinear sampling
2019-06-15 16:03:02 -07:00
jp9000
02e523c125 libobs: Update version to 23.2.1 2019-06-13 22:28:10 -07:00
James Park
e72eb39e47 libobs: Disable blending when converting sources
This fixes the issue where limited-range RGB sources were being
composited with dirty render targets.
2019-06-12 22:23:51 -07:00
jp9000
d5708d656e libobs: Fix null pointer dereference 2019-06-11 16:26:09 -07:00
jp9000
0d4d7f617c libobs: Update version to 23.2.0 2019-06-10 22:10:21 -07:00
James Park
9f66b90d99 libobs: Area-resampling shader optimizations
Switch for loop to do/while because we know the condition is always
true for the first loop.

Replace int math with float math to play nicely with more GPUs.

Add variables imagesize/targetsize to avoid redundant reciprocals.

Intel GPA results: 1166 -> 836 us
2019-06-03 23:11:23 -07:00
James Park
614025742b libobs: linux-v4l2: obs-ffmpeg: Add packed BGR3 video support
Someone mentioned this format preserves the most quality for a
particular capture card using V4L2.
2019-05-30 06:05:53 -07:00
James Park
8d6ed988e6 libobs: Remove unnecessary frame pipelining
Remove three instances of unnecessary double-buffering. They are not
needed to avoid stalls, and cause increased memory traffic when
measured on Intel HD 530, presumably because texture data will remain
in cache if sampled immediately after write.

(Note: GPU timings from Intel GPA are volatile.)

NV12, 3 Draws:
RGBA -> UYVX: 628 us -> 543 us
UYVX -> Y: 522 us -> 507 us
UYVX -> UV: 315 us -> 187 us
Total, Duration: 1594 us -> 1153 us
Total, GTI Read Throughput: 25.2 MB -> 15.9 MB
2019-05-24 01:03:21 -07:00
jp9000
3699209ce4 libobs: Pair encoders only when output actually starts
Normally, paired encoders are unpaired when they stop.  However, if the
pairing occurs before the encoders actually start, and the encoders
never actually end up starting, they are never unpaired, and that
pairing stays with them until the next time an output is started up
again.  That in turn can cause an output that uses one of the encoders
but not the other to not function correctly, and neither properly
"start" nor stop because the data is queued continually in the
interleaved packet array.

For example, let's say there are two outputs, two video encoders, and
one audio encoder.  This can be reproduced by using advanced output mode
and making the two outputs use separate video encoders while sharing
track 1's audio encoder.  If you start up the stream output first and it
fails to fully connect for whatever reason (bad server, bad stream key,
etc), then you start up the recording output, the recording output will
appear to be running, but will not stop when you hit "stop recording".
It will stay perpetually on "stopping recording" and will get stuck that
way.  This is because when the streaming output started, the streaming
output would initially pair video encoder A with audio encoder A before
the encoders actually fully started up (as the encoders do not fully
start up until a connection is successfully made), and when the
recording output starts up after that disconnection, audio encoder A
will wait for video encoder A rather than video encoder B because that
pairing was never actually cleared.

So, instead of pairing encoders when the output starts, wait until the
encoders themselves are being started and then pair the encoders at that
point in time.  This ensures that the encoders start up and will clear
their pairing when no longer in use.
2019-05-22 00:37:12 -07:00
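
Note on "libobs: Pair encoders only when output actually starts" above: the stuck-recording scenario can be replayed with a toy model. Everything below is hypothetical (class, function names, and the `pair_early` switch). In particular, the rule that an existing pairing is left in place is an assumption made to mirror the description that the stale pairing "was never actually cleared". It is not the libobs implementation.

```python
class Encoder:
    def __init__(self, name):
        self.name = name
        self.paired = None
        self.active = False

    def start(self):
        self.active = True

    def stop(self):
        # Pairing is only cleared when an encoder stops.
        self.active = False
        if self.paired is not None:
            self.paired.paired = None
            self.paired = None


def pair(audio, video):
    # Assumption: an existing pairing is left untouched.
    if audio.paired is None and video.paired is None:
        audio.paired, video.paired = video, audio


def start_output(video, audio, connects, pair_early):
    if pair_early:
        pair(audio, video)      # old behaviour: pair when the output starts
    if not connects:
        return False            # encoders never start, so stop() never runs
    if not pair_early:
        pair(audio, video)      # new behaviour: pair when the encoders start
    video.start()
    audio.start()
    return True


def simulate(pair_early):
    video_a, video_b = Encoder("video A"), Encoder("video B")
    audio_a = Encoder("audio A")
    start_output(video_a, audio_a, connects=False, pair_early=pair_early)  # stream fails
    start_output(video_b, audio_a, connects=True, pair_early=pair_early)   # recording
    return audio_a.paired.name


old_pairing = simulate(pair_early=True)
new_pairing = simulate(pair_early=False)
```

In this model the old ordering leaves the audio encoder tied to the streaming video encoder that never started, while the new ordering pairs it with the recording's video encoder.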
James Park
ab8e895b12 libobs: Remove unreachable YUV decode paths
A previous refactoring to make DrawMatrix unnecessary has left behind
unreachable YUV conversions. Even if this code was somehow reachable,
DrawMatrix for YUV -> RGB doesn't exist anymore, so they would render
incorrectly anyway.
2019-05-19 21:27:18 -07:00