Currently, slices are passed via reference, even though it would be
better to pass the ptr and len as separate arguments (#561). This means
that any function call with a slice parameter cannot be a tail call,
because according to LLVM spec:
> Both [tail,musttail] markers imply that the callee does not access
> allocas from the caller
There was one other place we were setting `tail` and I made that
conditional on whether or not the argument referenced allocas in the
caller.
This was causing undefined behavior in the compiler when it hit asserts,
causing it to print garbage memory to the terminal. See #3262 for
example.
/home/shawn/git/zig-simd/build/lib/zig/std/special/start.zig:23:40: error: exported symbol collision: '_start'
@export("_start", _start, .Strong);
^
/home/shawn/git/zig-simd/build/d.zig:1:1: note: other symbol is here
pub export fn _start() void {
^
/home/shawn/git/zig-simd/build/lib/zig/std/special/start.zig:124:35: error: root source file has no member called 'main'
switch (@typeInfo(@typeOf(root.main).ReturnType)) {
* Fix codegen for splat - instead of giving vectors of length N
to shufflevector for both of the operands, it gives vectors of length
1. The mask vector is the only one that needs N elements.
* Separate Splat into SplatSrc and SplatGen; the `len` is not needed
once it gets to codegen since it is redundant with the result type.
* Refactor compile error for wrong vector element type so that the
compile error message is not duplicated in zig source code
* Improve implementation to correctly handle comptime values such as
undefined and lazy values.
* Improve compile error for bad vector element type to point to the
correct place.
* Delete dead code.
* Modify behavior test to use an array cast instead of vector element
indexing since I'm merging this splat commit out-of-order from
Shawn's patch set.
* update docs for `@byteSwap`.
* fix hash & eql functions for ZigLLVMFnIdBswap not updated to
include vector len. this was causing incorrect bswap function
being called in unrelated code
* fix `@byteSwap` behavior tests only testing comptime and not
runtime operations
* implement runtime `@byteSwap`
* fix incorrect logic in ir_render_vector_to_array and
ir_render_array_to_vector with regards to whether or not to bitcast
* `@byteSwap` accepts an array operand which it will cast to vector
* simplify `@byteSwap` semantic analysis code and various fixes
* update documentation
- move `@shuffle` to be sorted alphabetically
- remove mention of LLVM
- minor clarifications & rewording
* introduce ir_resolve_vector_elem_type to avoid duplicate compile
error message and duplicate vector element checking logic
* rework ir_analyze_shuffle_vector to solve various issues
* improve `@shuffle` to allow implicit cast of arrays
* the shuffle tests weren't being run
I change the semantics of the mask operand, to make it a little more
flexible. There is no real danger in this because it is a compile-error
if you do it the LLVM way (and there is an appropiate error to tell you
this).
v2: avoid problems with double-free
The question was:
> // TODO do we need lazy values on vector comparisons?
Nope, in fact the existing code already was returning ErrorNotLazy
for that particular type, and would already goto
never_mind_just_calculate_it_normally. So the explicit check for
ZigTypeIdVector is not needed. I appreciate the caution though.
* bitcasting is still better when the size_in_bits aligns with the ABI
size of the element type. Logic is reworked to do bitcasting when
possible
* rather than using insertelement/extractelement to work with arrays,
store/load elements directly. This matches codegen for arrays
elsewhere.
See musl commit 05870abeaac0588fb9115cfd11f96880a0af2108
by Rich Felker.
Commit message from musl reproduced here:
fix code path where child function returns in arm __clone built as thumb
mov lr,pc is not a valid way to save the return address in thumb mode
since it omits the thumb bit. use a chain of bl and bx to emulate blx.
this could be avoided by converting to a .S file with preprocessor
conditions to use blx if available, but the time cost here is
dominated by the syscall anyway.
while making this change, also remove the remnants of support for
pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c
removed the hack from the parent code paths, but left the unnecessary
code in the child. keeping it would require rewriting two code paths
rather than one, and is useless for reasons described in that commit.