Showcase that Zig can be a great option for high performance cryptography.
The AEGIS family of authenticated encryption algorithms was selected for
high-performance applications in the final portfolio of the CAESAR
competition.
They reuse the AES core function, but are substantially faster than the
CCM, GCM and OCB modes while offering a high level of security.
AEGIS algorithms are especially fast on CPUs with built-in AES support, and
the 128L variant fully takes advantage of the pipeline in modern Intel CPUs.
Performance of the Zig implementation is on par with libsodium.
Before:
gimli-hash: 120 MiB/s
gimli-aead: 130 MiB/s
After:
gimli-hash: 195 MiB/s
gimli-aead: 208 MiB/s
Also fixes in-place decryption by the way.
If the input & output buffers were the same, decryption used to fail.
Return on decryption error in the benchmark to detect similar issues
in future AEADs even in non release-fast mode.
* Reorganize crypto/aes in order to separate parameters, implementations and
modes.
* Add a zero-cost abstraction over the internal representation of a block,
so that blocks can be kept in vector registers in optimized implementations.
* Add architecture-independent aesenc/aesdec/aesenclast/aesdeclast operations,
so that any AES-based primitive can be implemented, including these that don't
use the original key schedule (AES-PRF, AEGIS, MeowHash...)
* Add support for parallelization/wide blocks to take advantage of hardware
implementations.
* Align T-tables to cache lines in the software implementations to slightly
reduce side channels.
* Add an optimized implementation for modern Intel CPUs with AES-NI.
* Add new tests (AES256 key expansion).
* Reimplement the counter mode to work with any block cipher, any endianness
and to take advantage of wide blocks.
* Add benchmarks for AES.
Move block definitions inside while loop.
Use usize for offset. (This still crashes on overflow)
Remove unneeded slice syntax.
Add slow test for Very large dkLen
- 1MiB objects on the stack doesn't play well with wasmtime.
Reduce these to 512KiB so that the webassembly benchmarks can run.
- Pass expected results to a blackBox() function. Without this, in
release-fast mode, the compiler could detected unused return values,
and would produce results that didn't make sense for siphash.
- Add AEAD constructions to the benchmarks.
- Inline chacha20Core() makes it 4 times faster.
- benchmarkSignatures() -> benchmarkSignature() for consistency.
SipHash *is* a cryptographic function, with a 128-bit security level.
However, it is not a regular hash function: a secret key is required,
and knowledge of that key allows collisions to be quickly computed offline.
SipHash is therefore more suitable to be used as a MAC.
The same API as other MACs was implemented in addition to functions directly
returning an integer.
The benchmarks have been updated accordingly.
No changes to the SipHash implementation itself.
- This avoids having multiple `init()` functions for every combination
of optional parameters
- The API is consistent across all hash functions
- New options can be added later without breaking existing applications.
For example, this is going to come in handy if we implement parallelization
for BLAKE2 and BLAKE3.
- We don't have a mix of snake_case and camelCase functions any more, at
least in the public crypto API
Support for BLAKE2 salt and personalization (more commonly called context)
parameters have been implemented by the way to illustrate this.
Justification:
- reset() is unnecessary; states that have to be reused can be copied
- reset() is error-prone. Copying a previous state prevents forgetting
struct members.
- reset() forces implementation to store sensitive data (key, initial state)
in memory even when they are not needed.
- reset() is confusing as it has a different meaning elsewhere in Zig.
Instead of having all primitives and constructions share the same namespace,
they are now organized by category and function family.
Types within the same category are expected to share the exact same API.
* Factor redundant code in std/crypto/chacha20
* Add support for XChaCha20, and the XChaCha20-Poly1305 construction.
XChaCha20 is a 24-byte version of ChaCha20, is widely implemented
and is on the standards track:
https://tools.ietf.org/html/draft-irtf-cfrg-xchacha-03
* Add support for encryption/decryption with the authentication tag
detached from the ciphertext
* Add wrappers with an API similar to the Gimli AEAD type, so that
we can use and benchmark AEADs with a common API.