Attempt to fix threading issues that cause OBS to force-crash when
compiled with latest Xcode. There are two places where the new SDK
introduces a force-crash because operations are not happening on the
main thread: when we modify the context view to switch swap chains, and
when we resize a swap chain.
Instead of using just one context for all rendering, we create an
additional context for each swap chain, set each view once on
initialization, and switch contexts only in present to blit the final
framebuffer. This is an extra copy, but it's pretty hairy to optimize
away, and it's not worth potential regressions just to speed up Mac.
For resizing, we schedule the update code to run on the main thread from
the render thread. Ideally, we wouldn't have to round trip the logic
from main thread to graphics thread and back, but I don't think we want
to hack up the interface for this, especially since OpenGL will give way
to Metal soon enough.
We really shouldn't be resetting duplicator state as part of gs_flush.
gs_begin_scene is not ideal because it is called twice per frame, and
only after duplicators have been ticked. Even though it makes no
user-facing difference, it makes more logical sense to reset at the top
of the frame than the bottom.
There don't appear to be any GPUs that support 3.2, but not 3.3. GLSL
330 maps to HLSL Shader Model 4, so this will theoretically make shaders
programs less likely to diverge, particularly for behavior around NaNs.
This change only wraps the functionality. I have rough code to exercise
the the query functionality, but that part is not really clean enough to
submit.
When mixing sampling with raw loads in a shader, ending a shader with a
load would case the default sampler to become unset for OpenGL. Instead,
initialize with no sampler, and only set if there is a sampler.
glGetError() returns GL_INVALID_OPERATION during OBS shutdown when GL is
used on Windows. This change gives up after eight errors.
This could be avoided by stopping the graphics thread before window
destruction, but the shutdown code looks like it could be tricky to
reorder.
Code submissions have continually suffered from formatting
inconsistencies that constantly have to be addressed. Using
clang-format simplifies this by making code formatting more consistent,
and allows automation of the code formatting so that maintainers can
focus more on the code itself instead of code formatting.
The cache coherency of rasterization for full-screen passes is better
using an oversized triangle that is clipped rather than two triangles.
Traversal order of rasterization is GPU-specific, but will almost
certainly be better using an undivided primitive.
A smaller benefit is that quads along the diagonal are not evaluated
multiple times, but that's minor in comparison.
Redo format shaders to bypass vertex buffer, and input layout. Add
global shader bool "obs_glsl_compile" to make API-specific decisions,
i.e. handle upside-down UVs. gl_ortho is not needed for format
conversion because the vertex shader does not use ViewProj anymore.
This can be applied to more situations, but start small first.
Testbed full screen passes, Intel HD Graphics 530:
RGBA -> UYVX: 467 -> 439 us, ~6% savings
UYVX -> uv: 295 -> 239 us, ~19% savings
Add support for debug markers via D3DPERF API and KHR_debug. This makes
it easier to understand RenderDoc captures.
D3DPERF is preferred to ID3DUserDefinedAnnotation because it supports
colors. d3d9.lib is now linked in to support this.
This feature is disabled by default, and is controlled by
GS_USE_DEBUG_MARKERS.
From: obsproject/obs-studio#1799