We currently have ciphers optimized for performance, for
compatibility, for size and for specific CPUs.
However we lack a class of ciphers that is becoming increasingly
important, as Zig is being used for embedded systems, but also as
hardware-level side channels keep being found on (Intel) CPUs.
Here is ISAPv2, a construction specifically designed for resilience
against leakage and fault attacks.
ISAPv2 is obviously not optimized for performance, but can be an
option for highly sensitive data, when the runtime environment cannot
be trusted.
This is a trivial implementation that just does a or[xor] loop.
However, this pattern is used by virtually all crypto libraries and
in practice, even without assembly barriers, LLVM never turns it into
code with conditional jumps, even if one of the parameters is constant.
This has been verified to still be the case with LLVM 11.0.0.
With the simple rule that whenever we have or will have 2 similar
functions, they should be in their own namespace.
Some of these new namespaces currently contain a single function.
This is to prepare for reduced-round versions that are likely to
be added later.
The bcrypt function intentionally requires quite a lot of CPU cycles
to complete.
In addition to that, not having its full state constantly in the
CPU L1 cache causes a massive performance drop.
These properties slow down brute-force attacks against low-entropy
inputs (typically passwords), and GPU-based attacks get little
to no advantages over CPUs.
The NaCl constructions are available in pretty much all programming
languages, making them a solid choice for applications that require
interoperability.
Go includes them in the standard library, JavaScript has the popular
tweetnacl.js module, and reimplementations and ports of TweetNaCl
have been made everywhere.
Zig has almost everything that NaCl has at this point, the main
missing component being the Salsa20 cipher, on top on which NaCl's
secretboxes, boxes, and sealedboxes can be implemented.
So, here they are!
And clean the X25519 API up a little bit by the way.
- use `PascalCase` for all types. So, AES256GCM is now Aes256Gcm.
- consistently use `_length` instead of mixing `_size` and `_length` for the
constants we expose
- Use `minimum_key_length` when it represents an actual minimum length.
Otherwise, use `key_length`.
- Require output buffers (for ciphertexts, macs, hashes) to be of the right
size, not at least of that size in some functions, and the exact size elsewhere.
- Use a `_bits` suffix instead of `_length` when a size is represented as a
number of bits to avoid confusion.
- Functions returning a constant-sized slice are now defined as a slice instead
of a pointer + a runtime assertion. This is the case for most hash functions.
- Use `camelCase` for all functions instead of `snake_case`.
No functional changes, but these are breaking API changes.
Showcase that Zig can be a great option for high performance cryptography.
The AEGIS family of authenticated encryption algorithms was selected for
high-performance applications in the final portfolio of the CAESAR
competition.
They reuse the AES core function, but are substantially faster than the
CCM, GCM and OCB modes while offering a high level of security.
AEGIS algorithms are especially fast on CPUs with built-in AES support, and
the 128L variant fully takes advantage of the pipeline in modern Intel CPUs.
Performance of the Zig implementation is on par with libsodium.
* Reorganize crypto/aes in order to separate parameters, implementations and
modes.
* Add a zero-cost abstraction over the internal representation of a block,
so that blocks can be kept in vector registers in optimized implementations.
* Add architecture-independent aesenc/aesdec/aesenclast/aesdeclast operations,
so that any AES-based primitive can be implemented, including these that don't
use the original key schedule (AES-PRF, AEGIS, MeowHash...)
* Add support for parallelization/wide blocks to take advantage of hardware
implementations.
* Align T-tables to cache lines in the software implementations to slightly
reduce side channels.
* Add an optimized implementation for modern Intel CPUs with AES-NI.
* Add new tests (AES256 key expansion).
* Reimplement the counter mode to work with any block cipher, any endianness
and to take advantage of wide blocks.
* Add benchmarks for AES.
Password hashing functions are not general-purpose KDFs, and KDFs
don't have to satisfy the same properties as a PHF.
This will allow fast KDFs such as the HKDF construction to be in a
category of their own, while clarifying what functions are suitable
for using passwords as inputs.
SipHash *is* a cryptographic function, with a 128-bit security level.
However, it is not a regular hash function: a secret key is required,
and knowledge of that key allows collisions to be quickly computed offline.
SipHash is therefore more suitable to be used as a MAC.
The same API as other MACs was implemented in addition to functions directly
returning an integer.
The benchmarks have been updated accordingly.
No changes to the SipHash implementation itself.
- This avoids having multiple `init()` functions for every combination
of optional parameters
- The API is consistent across all hash functions
- New options can be added later without breaking existing applications.
For example, this is going to come in handy if we implement parallelization
for BLAKE2 and BLAKE3.
- We don't have a mix of snake_case and camelCase functions any more, at
least in the public crypto API
Support for BLAKE2 salt and personalization (more commonly called context)
parameters have been implemented by the way to illustrate this.
Justification:
- reset() is unnecessary; states that have to be reused can be copied
- reset() is error-prone. Copying a previous state prevents forgetting
struct members.
- reset() forces implementation to store sensitive data (key, initial state)
in memory even when they are not needed.
- reset() is confusing as it has a different meaning elsewhere in Zig.
Instead of having all primitives and constructions share the same namespace,
they are now organized by category and function family.
Types within the same category are expected to share the exact same API.
This is a rewrite of the x25519 code, that generalizes support for
common primitives based on the same finite field.
- Low-level operations can now be performed over the curve25519 and
edwards25519 curves, as well as the ristretto255 group.
- Ed25519 signatures have been implemented.
- X25519 is now about twice as fast.
- mem.timingSafeEqual() has been added for constant-time comparison.
Domains have been clearly separated, making it easier to later add
platform-specific implementations.
This is a translation of the [official reference implementation][1] with
few other changes. The bad news is that the reference implementation is
designed for simplicity and not speed, so there's a lot of room for
performance improvement. The good news is that, according to the crypto
benchmark, the implementation is still fast relative to the other
hashing algorithms:
```
md5: 430 MiB/s
sha1: 386 MiB/s
sha256: 191 MiB/s
sha512: 275 MiB/s
sha3-256: 233 MiB/s
sha3-512: 137 MiB/s
blake2s: 464 MiB/s
blake2b: 526 MiB/s
blake3: 576 MiB/s
poly1305: 1479 MiB/s
hmac-md5: 653 MiB/s
hmac-sha1: 553 MiB/s
hmac-sha256: 222 MiB/s
x25519: 8685 exchanges/s
```
[1]: https://github.com/BLAKE3-team/BLAKE3