Turns out that on some adapters, due to some sort of internal GPU
precision error, fmod(x, y) can return x when x == y, which is incorrect
(the result should be 0; and no, the values really were equal, not off
due to precision errors). This would sometimes cause the shader to
sample the wrong coordinates at the edges. Adding 0.1 to the x value
before passing it to fmod and then flooring the result fixes the issue.
- Changed glMapBuffer to glMapBufferRange to allow buffer invalidation.
Using glMapBuffer alone was causing some unacceptable stalls.
- Changed dynamic buffers from GL_DYNAMIC_DRAW to GL_STREAM_DRAW
because I had misunderstood the OpenGL specification.
- Added _OPENGL and _D3D11 built-in preprocessor macros to effects to
allow API-specific processing if needed.
- Added fmod support to shaders (NOTE: D3D and GL do not behave
identically with negative numbers when using this. With positive
numbers, however, they behave identically.)
- Created a planar conversion shader that converts from packed YUV to
planar 4:2:0 right on the GPU without any CPU processing. This also
reduces the required GPU download size to approximately 37.5% of its
normal size. GPU usage is down by a full 10 percentage points despite
the extra required pass.
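
The 37.5% figure for the planar conversion can be checked with some
quick arithmetic. The sketch below assumes the packed source uses 4
bytes per pixel (an assumption; the exact packed format is not stated
here) and that the frame dimensions are even. The function names are
hypothetical:

```c
#include <stddef.h>

/* Packed frame: assumed 4 bytes per pixel. */
size_t packed_frame_bytes(size_t width, size_t height)
{
	return width * height * 4;
}

/* Planar 4:2:0 frame: one full-resolution luma plane plus two chroma
 * planes subsampled by 2 in each dimension, i.e. 1.5 bytes per pixel.
 * Assumes width and height are even. */
size_t planar420_frame_bytes(size_t width, size_t height)
{
	size_t luma = width * height;
	size_t chroma = (width / 2) * (height / 2);
	return luma + 2 * chroma;
}
```

For a 1920x1080 frame this gives 8294400 bytes packed versus 3110400
bytes planar, and 1.5 / 4 = 0.375.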