Changed vec4f.m in common.h from "__m128" to "float __attribute__ ((vector_size (16)))".
All the built-ins use gcc names except for movaps, which uses Intel's _mm_store_ps.
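
For reference, a rough sketch of what the type change looks like. Only the member name "m" and the vector_size attribute come from this commit; the union layout, the v4sf typedef and the vec4f_make / vec4f_add / vec4f_store helpers are assumptions made up for illustration, not the actual contents of common.h. The add uses gcc's generic vector "+" operator instead of a named built-in, and the store falls back to memcpy when SSE is not available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* Intel intrinsics, only needed for _mm_store_ps */
#endif

/* gcc vector extension: four packed floats, 16 bytes wide. */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Hypothetical stand-in for vec4f; only the member "m" is from the commit. */
typedef union {
    v4sf  m;      /* was: __m128 m; */
    float f[4];   /* assumed scalar view of the same 16 bytes */
} vec4f;

/* Build a vector from four scalars (hypothetical helper). */
static vec4f vec4f_make(float x, float y, float z, float w)
{
    vec4f v;
    v.m = (v4sf){ x, y, z, w };
    return v;
}

/* Element-wise add using gcc's vector arithmetic (hypothetical helper). */
static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    r.m = a.m + b.m;
    return r;
}

/* Aligned store; dst must be 16-byte aligned (hypothetical helper). */
static void vec4f_store(float *dst, vec4f v)
{
    assert(((uintptr_t)dst & 15u) == 0);
#if defined(__SSE__)
    _mm_store_ps(dst, (__m128)v.m);   /* the one Intel-named call */
#else
    memcpy(dst, &v.m, sizeof v.m);    /* fallback where SSE is unavailable */
#endif
}
```

The destination passed to vec4f_store has to be 16-byte aligned, e.g. declared as "float out[4] __attribute__ ((aligned (16)))", because the aligned store faults on unaligned addresses.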
Took some inline assembly from http://www.cortstratton.org/articles/HugiCode.html#bm5 and hacked it into gcc extended inline asm.
gcc doesn't like it, so I will redo it with built-ins in the next commit. This is just for reference.
- moved source into ./source/ and headers into ./include
- updated the makefiles to support this
- added HEADERS_LUA flag to support easily specifying the location of your lua5.1 headers