GS_RGBA, GS_BGRX, and GS_BGRA now use TYPELESS DXGI formats, so we can
alias them between UNORM and UNORM_SRGB as necessary. GS_RGBA_UNORM,
GS_BGRX_UNORM, and GS_BGRA_UNORM have been added to support straight
UNORM types, which Windows requires for sharing textures from D3D9 and
OpenGL. The D3D path aliases via views, and GL aliases via
GL_EXT_texture_sRGB_decode/GL_FRAMEBUFFER_SRGB.
A significant amount of code has changed in the D3D/GL backends, but the
concepts are simple. On the D3D side, we need separate SRVs and RTVs to
support nonlinear/linear reads and writes. On the GL side, we need to
set the proper GL parameters to emulate the same.
Add gs_enable_framebuffer_srgb/gs_framebuffer_srgb_enabled to set/get
the framebuffer as SRGB or not.
Add gs_linear_srgb_active/gs_set_linear_srgb to instruct sources that
they should render as SRGB. Legacy sources can ignore this setting
without regression.
Update obs_source_draw to use linear SRGB as needed.
Update render_filter_tex to use linear SRGB as needed.
Add gs_effect_set_texture_srgb next to gs_effect_set_texture to set
texture with SRGB view instead.
Add SRGB helpers for vec4 struct.
Create GDI-compatible textures without SRGB support. Doesn't seem to
work with SRGB formats.
Attempt to fix threading issues that cause OBS to force-crash when
compiled with latest Xcode. There are two places where the new SDK
introduces a force-crash because operations are not happening on the
main thread: when we modify the context view to switch swap chains, and
when we resize a swap chain.
Instead of using just one context for all rendering, we create an
additional context for each swap chain, set each view once on
initialization, and switch contexts only in present to blit the final
framebuffer. This is an extra copy, but it's pretty hairy to optimize
away, and it's not worth potential regressions just to speed up Mac.
For resizing, we schedule the update code to run on the main thread from
the render thread. Ideally, we wouldn't have to round trip the logic
from main thread to graphics thread and back, but I don't think we want
to hack up the interface for this, especially since OpenGL will give way
to Metal soon enough.
We really shouldn't be resetting duplicator state as part of gs_flush.
gs_begin_scene is not ideal because it is called twice per frame, and
only after duplicators have been ticked. Even though it makes no
user-facing difference, it makes more logical sense to reset at the top
of the frame than the bottom.
There don't appear to be any GPUs that support 3.2, but not 3.3. GLSL
330 maps to HLSL Shader Model 4, so this will theoretically make shaders
programs less likely to diverge, particularly for behavior around NaNs.
This change only wraps the functionality. I have rough code to exercise
the the query functionality, but that part is not really clean enough to
submit.
Code submissions have continually suffered from formatting
inconsistencies that constantly have to be addressed. Using
clang-format simplifies this by making code formatting more consistent,
and allows automation of the code formatting so that maintainers can
focus more on the code itself instead of code formatting.
The cache coherency of rasterization for full-screen passes is better
using an oversized triangle that is clipped rather than two triangles.
Traversal order of rasterization is GPU-specific, but will almost
certainly be better using an undivided primitive.
A smaller benefit is that quads along the diagonal are not evaluated
multiple times, but that's minor in comparison.
Redo format shaders to bypass vertex buffer, and input layout. Add
global shader bool "obs_glsl_compile" to make API-specific decisions,
i.e. handle upside-down UVs. gl_ortho is not needed for format
conversion because the vertex shader does not use ViewProj anymore.
This can be applied to more situations, but start small first.
Testbed full screen passes, Intel HD Graphics 530:
RGBA -> UYVX: 467 -> 439 us, ~6% savings
UYVX -> uv: 295 -> 239 us, ~19% savings
Add support for debug markers via D3DPERF API and KHR_debug. This makes
it easier to understand RenderDoc captures.
D3DPERF is preferred to ID3DUserDefinedAnnotation because it supports
colors. d3d9.lib is now linked in to support this.
This feature is disabled by default, and is controlled by
GS_USE_DEBUG_MARKERS.
From: obsproject/obs-studio#1799
The previous model stored a new FBO per texture width/height/format on
a array in the device struct. This allocated memory was only released
on gs_device_destroy (obs exit).
The new approach stores a FBO on gs_texture and the its info is
destroyed once the texture is deleted.
This commit logs the OpenGL version on all operating systems and brings
the OpenGL subsystem's initialization logging more in line with the
D3D11 logging to make logs more uniform.
To be able to use index buffers, they must also be bound to a vertex
array object along with the vertex buffers.
Ideally, if there are multiple index buffers for a vertex buffer,
separate VAOs should be created for each combination.
glFlush is somewhat implementation-specific; on OSX for example, it is
additionally used to draw to a view. However, we're already using the
Objective-C function flushBuffer, which apparently calls glFlush
internally anyway to draw to views. That means that we're superfluously
calling glFlush most of the time if there's an active swap chain. So
instead, only call glFlush when there are no active swap chains on OSX.
This allows the ability to separate the blend states of color and alpha.
The default blend state has also changed so that alpha is always added
together to ensure that the destination image always gets an alpha value
that is actually usable after the operation (for render targets).
Old default state:
color source: GS_BLEND_SRCALPHA, color dest: GS_BLEND_INVSRCALPHA
alpha source: GS_BLEND_SRCALPHA, alpha dest: GS_BLEND_INVSRCALPHA
New default state:
color source: GS_BLEND_SRCALPHA, color dest: GS_BLEND_INVSRCALPHA
alpha source: GS_BLEND_ONE, alpha dest: GS_BLEND_ONE
If a render target was switched from one to another and then back
consecutively, the texture would not get reattached at the last point
due to the fact that the cur_render_target variable of the FBO storage
structure would already be set to that texture, and then it would never
get reattached because it thought it was already using that render
target.
So to fix this, any time a render target legitimately changes from one
to another, it sets the cur_render_target and cur_zstencil_buffer
variables of the FBO structure to null.
When render targets are used, they output to the render target inverted
due to the way that opengl works. This fixes that issue by inverting
the projection matrix so that it renders the image upside down and
inverting the front face from counterclockwise to clockwise.
The glDebugMessageCallback function will set a callback that will relay
all messages coming from the driver (on supported drivers at least).
For nvidia, sometimes there are a lot of irrelevant messages, which is
nice depending on the type of application you're writing.
I actually at first thought these messages were important, but it turns
out that they're almost always irrelevant and not actually warnings.
Most of the messages at a certain type/severity are mostly for debugging
purposes or minor hints about how to maximize performance, so
unfortunately there ends up being a lot of pointless spam in the debug
output.
The particular performance messages are related to optimizations you can
get via multithreading, and are optimizations you would expect for games
more than for this type of application. They don't really apply for our
use cases most of the time.
High severity messages however are not omitted regardless of message
type.
These messages can be enabled again by simply defining the
SHOW_ALL_GL_MESSAGES macro.
This Fixes a minor flaw with the API where data had to always be mutable
to be usable by the API.
Functions that do not modify the fundamental underlying data of a
structure should be marked as constant, both for safety and to signify
that the parameter is input only and will not be modified by the
function using it.
Typedef pointers are unsafe. If you do:
typedef struct bla *bla_t;
then you cannot use it as a constant, such as: const bla_t, because
that constant will be to the pointer itself rather than to the
underlying data. I admit this was a fundamental mistake that must
be corrected.
All typedefs that were pointer types will now have their pointers
removed from the type itself, and the pointers will be used when they
are actually used as variables/parameters/returns instead.
This does not break ABI though, which is pretty nice.
This replaces the ARB_separate_shader_objects extension with traditional
linked shaders. I was able to get the existing system to use linked
shaders without having to change any libobs graphics API.
This essentially creates a linked list of shader programs with
references to the shaders they link. Before draw, it searches that
linked list for a particular pixel/vertex shader pair, and the linked
program associated with it. If no matching program exists, it creates
the program.
Changed API functions:
libobs: obs_reset_video
Before, video initialization returned a boolean, but "failed" is too
little information, if it fails due to lack of device capabilities or
bad video device parameters, the front-end needs to know that.
The OBS Basic UI has also been updated to reflect this API change.