Issue #699 since fixed. Nearly a x3 perf improvement.
Using --release-fast.
Sha3_256 (before): 96 Mb/s
Sha3_256 (after): 267 Mb/s
Sha3_512 (before): 53 Mb/s
Sha3_512 (after): 142 Mb/s
No real gains from unrolling other initialization loops in crypto
functions so have been left as is.
The purpose of this is:
* Only one way to do things
* Changing a function with void return type to return a possible
error becomes a 1 character change, subtly encouraging
people to use errors.
See #632
Here are some imperfect sed commands for performing this update:
remove arrow:
```
sed -i 's/\(\bfn\b.*\)-> /\1/g' $(find . -name "*.zig")
```
add void:
```
sed -i 's/\(\bfn\b.*\))\s*{/\1) void {/g' $(find ../ -name "*.zig")
```
Some cleanup may be necessary, but this should do the bulk of the work.
These are on the slower side and could be improved. No performance optimizations
yet have been done.
```
Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
```
-- Sha3-256
```
Zig --release-fast
93 Mb/s
Zig --release-safe
99 Mb/s
Zig
4 Mb/s
```
-- Sha3-512
```
Zig --release-fast
49 Mb/s
Zig --release-safe
54 Mb/s
Zig
2 Mb/s
```
Interestingly, release-safe is producing slightly better code than
release-fast.
Some performance comparisons to C.
We take the fastest time measurement taken across multiple runs.
The block hashing functions use the same md5/sha1 methods.
```
Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
Gcc: 7.2.1 20171224
Clang: 5.0.1
Zig: 0.1.1.304f6f1d
```
See https://www.nayuki.io/page/fast-md5-hash-implementation-in-x86-assembly:
```
gcc -O2
661 Mb/s
clang -O2
490 Mb/s
zig --release-fast and zig --release-safe
570 Mb/s
zig
50 Mb/s
```
See https://www.nayuki.io/page/fast-sha1-hash-implementation-in-x86-assembly:
```
gcc -O2
588 Mb/s
clang -O2
563 Mb/s
zig --release-fast and zig --release-safe
610 Mb/s
zig
21 Mb/s
```
In short, zig provides pretty useful tools for writing this sort of
code. We are in the lead against clang (which uses the same LLVM
backend) with us being slower only against md5 with GCC.