Also, rename atomic functions to be consistent with the rest of the
platform/threading functions, and move atomic functions to threading*
files rather than platform* files.
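
A minimal sketch of what thin atomic wrappers living alongside the
threading code could look like, assuming C11 <stdatomic.h> is available.
The sketch_atomic_* names are hypothetical; this commit does not list
the actual renamed functions, and the real code may use compiler or
platform intrinsics instead.

```c
#include <stdatomic.h>

/* Hypothetical names for illustration only; the commit does not spell
 * out the renamed functions. */

/* Atomically increments *val and returns the new value. */
static inline long sketch_atomic_inc_long(atomic_long *val)
{
	/* atomic_fetch_add returns the previous value, so add 1 */
	return atomic_fetch_add(val, 1) + 1;
}

/* Atomically decrements *val and returns the new value. */
static inline long sketch_atomic_dec_long(atomic_long *val)
{
	/* atomic_fetch_sub returns the previous value, so subtract 1 */
	return atomic_fetch_sub(val, 1) - 1;
}
```

A typical use would be reference counting: increment when a reference
is taken, and free the object once the decrement returns 0.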
Staging surfaces with GL originally copied to a texture and then
downloaded that copied texture, but there was no real need to do that.
Now they copy directly from the texture that's given to them rather
than making an intermediate copy first.
Secondly, hopefully fix the mac issue where the only way to perform an
asynchronous texture download is via FBOs and glReadPixels. It's a
really dumb issue with macs, and the number of "gotchas" and the amount
of non-standard internal GL functionality on mac is really annoying.
There were a *lot* of warnings; managed to remove most of them.
Also, put warning flags before C_FLAGS and CXX_FLAGS, rather than after,
as -Wall -Wextra was overwriting flags that came before it.
- Removed the dependency on windows.h on Windows. I feel it's an
unnecessarily large dependency to add to all source files when the
only thing needed to make the Windows version compile the debug
functions is the __stdcall calling convention keyword.
On top of increasing compile time due to the large number of Windows
API headers it pulls in, it also adds a lot of potential name
conflicts. I was getting conflicts for names like near/far, which
were used in old legacy 16-bit Windows code.
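
As a rough sketch of the idea above, the calling convention can be
supplied by a small macro instead of including windows.h. Only the
__stdcall keyword comes from this commit; SKETCH_DEBUG_CALL, the
callback signature, and the helper functions are hypothetical names
made up for illustration.

```c
/* Hypothetical macro: expands to __stdcall on Windows, nothing
 * elsewhere, so no Windows header is needed for the keyword. */
#ifdef _WIN32
#define SKETCH_DEBUG_CALL __stdcall
#else
#define SKETCH_DEBUG_CALL
#endif

/* Hypothetical debug callback type using that calling convention. */
typedef void (SKETCH_DEBUG_CALL *sketch_debug_proc)(unsigned int severity,
		const char *message, void *user);

/* Example callback: counts how many messages were delivered. */
static void SKETCH_DEBUG_CALL sketch_count_message(unsigned int severity,
		const char *message, void *user)
{
	(void)severity;
	(void)message;
	*(int *)user += 1;
}

/* Stand-in for a driver invoking the registered callback once. */
static void sketch_emit(sketch_debug_proc proc, void *user)
{
	proc(1, "sketch message", user);
}
```

Because the macro expands to nothing outside Windows, the same
declarations compile unchanged on other platforms, and names such as
near/far stay free for use in the source.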