147 Commits

Author SHA1 Message Date
Frank Denis
0adc144f88 std/crypto: adjust aesni parallelism to CPU models
Intel keeps changing the latency & throughput of the aes* and clmul
instructions every time they release a new model.

Adjust `optimal_parallel_blocks` accordingly, keeping 8 as a safe
default for unknown data.
2020-10-28 21:44:00 +02:00
Frank Denis
ea45897fcc PascalCase *box names, remove unneeded comptime & parenthesis
Also rename (salsa20|chacha20)Internal() to a better name.

And sort reexported crypto.* names
2020-10-28 21:43:15 +02:00
Žiga Željko
7c2bde1f07 std/crypto: API cleanup 2020-10-26 19:19:34 -04:00
Frank Denis
74a1175d9d std/*: add missing MIT license headers 2020-10-26 17:41:29 +01:00
Frank Denis
72064eba23 std/crypto: vectorize BLAKE3
Gives a ~40% speedup on x86_64.

However, the generic code remains faster on aarch64.

This is still processing only one block at a time for now.

I'm pretty confident that processing more blocks per round
will eventually give a substantial performance improvement on
all platforms with vector units.
2020-10-25 21:13:14 -04:00
Frank Denis
1b4ab749cf std/crypto: add the bcrypt password hashing function
The bcrypt function intentionally requires quite a lot of CPU cycles
to complete.

In addition to that, not having its full state constantly in the
CPU L1 cache causes a massive performance drop.

These properties slow down brute-force attacks against low-entropy
inputs (typically passwords), and GPU-based attacks get little
to no advantages over CPUs.
2020-10-25 21:11:40 -04:00
Frank Denis
0c7a99b38d Move ed25519 key pairs to a KeyPair structure 2020-10-25 21:55:05 +01:00
Frank Denis
28fb97f188 Add (X)Salsa20 and NaCl boxes
The NaCl constructions are available in pretty much all programming
languages, making them a solid choice for applications that require
interoperability.

Go includes them in the standard library, JavaScript has the popular
tweetnacl.js module, and reimplementations and ports of TweetNaCl
have been made everywhere.

Zig has almost everything that NaCl has at this point, the main
missing component being the Salsa20 cipher, on top on which NaCl's
secretboxes, boxes, and sealedboxes can be implemented.

So, here they are!

And clean the X25519 API up a little bit by the way.
2020-10-25 18:04:12 +01:00
Frank Denis
91a1c20e74 Fix a typo (s/multple/multiple/) 2020-10-24 07:57:34 +02:00
Frank Denis
047599928a Add a benchmark for signature verifications 2020-10-22 09:58:26 +02:00
Frank Denis
2d9befe9bf Implement multiscalar edwards25519 point multiplication 2020-10-22 09:58:26 +02:00
Frank Denis
0fb6fdd7eb Support variable-time edwards25519 scalar multiplication
This is useful to save some CPU cycles when the scalar is public,
such as when verifying signatures.
2020-10-22 09:58:26 +02:00
Frank Denis
ff658abe79 std/crypto/25519: use Barrett reduction for scalars (mod l) 2020-10-22 09:58:26 +02:00
Frank Denis
8e79b3cf23 std/crypto/25519: add support for batch Ed25519 signature verification 2020-10-22 09:58:26 +02:00
Frank Denis
fa17447090 std/crypto: make the whole APIs more consistent
- use `PascalCase` for all types. So, AES256GCM is now Aes256Gcm.
- consistently use `_length` instead of mixing `_size` and `_length` for the
constants we expose
- Use `minimum_key_length` when it represents an actual minimum length.
Otherwise, use `key_length`.
- Require output buffers (for ciphertexts, macs, hashes) to be of the right
size, not at least of that size in some functions, and the exact size elsewhere.
- Use a `_bits` suffix instead of `_length` when a size is represented as a
number of bits to avoid confusion.
- Functions returning a constant-sized slice are now defined as a slice instead
of a pointer + a runtime assertion. This is the case for most hash functions.
- Use `camelCase` for all functions instead of `snake_case`.

No functional changes, but these are breaking API changes.
2020-10-17 18:53:08 -04:00
Frank Denis
0b4a5254fa Vectorize Gimli 2020-10-16 18:41:11 -04:00
Frank Denis
51a3d0603c std.rand: set DefaultCsprng to Gimli, and require a larger seed
`DefaultCsprng` is documented as a cryptographically secure RNG.

While `ISAAC` is a CSPRNG, the variant we have, `ISAAC64` is not.
A 64 bit seed is a bit small to satisfy that claim.

We also saw it being used with the current date as a seed, that
also defeats the point of a CSPRNG.

Set `DefaultCsprng` to `Gimli` instead of `ISAAC64`, rename
the parameter from `init_s` to `secret_seed` + add a comment to
clarify what kind of seed is expected here.

Instead of directly touching the internals of the Gimli implementation
(which can change/be architecture-specific), add an `init()` function
to the state.

Our Gimli-based CSPRNG was also not backtracking resistant. Gimli
is a permutation; it can be reverted. So, if the state was ever leaked,
future secrets, but also all the previously generated ones could be
recovered. Clear the rate after a squeeze in order to prevent this.

Finally, a dumb test was added just to exercise `DefaultCsprng` since
we don't use it anywhere.
2020-10-15 20:57:16 -04:00
Frank Denis
cb44f27104 std/crypto/hmac: remove HmacBlake2s256 definition
HMAC is a generic construction, so we allow it to be instantiated
with any hash function.

In practice, HMAC is almost exclusively used with MD5, SHA1 and SHA2,
so it makes sense to define some shortcuts for them.

However, defining `HmacBlake2s256` is a bit weird (and why
specifically that one, and not other hash functions we also support?).
There would be nothing wrong with that construction, but it's not
used in any standard protocol and would be a curious choice.

BLAKE2 being a keyed hash function, it doesn't need HMAC to be used as
a MAC, so that also doesn't make it a good example of a possible hash
function for HMAC.

This commit doesn't remove the ability to use a Hmac(Blake2s256) type
if, for some reason, applications really need this, but it removes
HmacBlake2s256 as a constant.
2020-10-15 20:50:34 -04:00
Frank Denis
f3667e8a80 std/crypto/25519: do cofactored ed25519 verification
This is slightly slower but makes our verification function compatible
with batch signatures. Which, in turn, makes blockchain people happy.
And we want to make our users happy.

Add convenience functions to substract edwards25519 points and to
clear the cofactor.
2020-10-15 18:49:10 -04:00
Frank Denis
9f109ba0eb Simpler ChaCha20 vector code 2020-10-10 22:45:41 +02:00
Frank Denis
459128e059 Use an array of comptime_int for shuffle masks
Suggested by @LemonBoy - Thanks!
2020-10-10 22:45:41 +02:00
Frank Denis
9b386bda33 std/crypto: add a vectorized ChaCha20 implementation
Brings a 30% speed boost on x86_64 even though we still process only
one block at a time for now.

Only enabled on x86_64 since the non-vectorized implementation seems
to currently perform better on some architectures (at least on aarch64).

But the non-vectorized implementation still gets a little speed boost
as well (~17%) with these changes.
2020-10-10 22:45:41 +02:00
Andrew Kelley
b02341d6f5
Merge pull request #6614 from jedisct1/aes-arm
std/crypto/aes: add AES hardware acceleration on aarch64
2020-10-08 18:09:40 -04:00
Frank Denis
1bc2b68916 ghash: add pmull support on aarch64 2020-10-08 18:09:23 -04:00
Frank Denis
60d1e675d2 aes/aesni is not based on a Go implementation, only aes/soft is
Don't blame them for our bugs :)
2020-10-08 14:55:11 +02:00
Frank Denis
f39dc00ed4 std/crypto/aes: add AES hardware acceleration on aarch64 2020-10-08 14:55:08 +02:00
Frank Denis
fb63a2cfae std/crypto: faster (mod 2^255-19) square root computation
251 squarings, 250 multiplications -> 251 squarings, 11 multiplications
2020-10-06 19:48:26 -04:00
Frank Denis
06c16f44e7 std/crypto: Add support for AES-GCM
Already pretty fast on platforms with AES-NI, even though GHASH
reduction hasn't been optimized yet, and we don't do stitching either.
2020-10-06 00:00:33 +02:00
Frank Denis
d343b75e7f ghash & poly1305: fix handling of partial blocks and add pad()
pad() aligns the next input to the first byte of a block, which is
useful to implement the IETF version of ChaCha20Poly1305 and AES-GCM.
2020-10-05 23:50:38 +02:00
Andrew Kelley
8170a3d574
Merge pull request #6463 from jedisct1/ghash
std/crypto: add GHASH implementation
2020-10-04 02:46:36 -04:00
Frank Denis
97fd0974b9 ghash: add pclmul support on x86_64 2020-10-01 02:05:11 +02:00
Frank Denis
8161de7fa4 Implement ghash aggregated reduction
Performance increases from ~400 MiB/s to 450 MiB/s at the expense of
extra code. Thus, aggregation is disabled on ReleaseSmall.

Since the multiplication cost is significant compared to the reduction,
aggregating more than 2 blocks is probably not worth it.
2020-10-01 02:05:07 +02:00
Frank Denis
f1ad94437b ghash & poly1305: use pointer to slices for keys and output 2020-10-01 02:04:30 +02:00
Frank Denis
58873ed3f9 std/crypto: add GHASH implementation
GHASH is required to implement AES-GCM.

Optimized implementations for CPUs with instructions for carry-less
multiplication will be added next.
2020-10-01 02:04:30 +02:00
Frank Denis
d75d6e7f77 Remove unused var, sort std.crypto.* 2020-09-30 01:39:55 +02:00
Frank Denis
6eaba61ef5 std/crypto: implement the HKDF construction 2020-09-30 01:39:55 +02:00
Andrew Kelley
a1ae3f92c1
Merge pull request #6442 from jedisct1/aegis
std/crypto: add the AEGIS AEADs
2020-09-29 15:18:06 -04:00
Frank Denis
8d67f15d36 aegis: add test vectors, and link to the latest version of the spec 2020-09-29 17:10:04 +02:00
Frank Denis
bb1c6bc376 Add AEGIS-256 as well 2020-09-29 17:10:04 +02:00
Frank Denis
9f274e1f7d std/crypto: add the AEGIS128L AEAD
Showcase that Zig can be a great option for high performance cryptography.

The AEGIS family of authenticated encryption algorithms was selected for
high-performance applications in the final portfolio of the CAESAR
competition.

They reuse the AES core function, but are substantially faster than the
CCM, GCM and OCB modes while offering a high level of security.

AEGIS algorithms are especially fast on CPUs with built-in AES support, and
the 128L variant fully takes advantage of the pipeline in modern Intel CPUs.

Performance of the Zig implementation is on par with libsodium.
2020-09-29 17:10:04 +02:00
Frank Denis
56d820087d gimli: make permute a constant, remove leading underscore 2020-09-29 14:01:08 +02:00
Frank Denis
4194714965 Don't unroll the gimli permutation on release-small 2020-09-29 13:23:04 +02:00
Frank Denis
613f8fe83f Use mem.copy() instead of manual iterations 2020-09-29 10:23:00 +02:00
Frank Denis
868a46eb43 std/crypto: make gimli slightly faster
Before:
       gimli-hash:        120 MiB/s
       gimli-aead:        130 MiB/s

After:
       gimli-hash:        195 MiB/s
       gimli-aead:        208 MiB/s

Also fixes in-place decryption by the way.

If the input & output buffers were the same, decryption used to fail.

Return on decryption error in the benchmark to detect similar issues
in future AEADs even in non release-fast mode.
2020-09-29 00:29:20 +02:00
Frank Denis
bd89bd6fdb Revamp crypto/aes
* Reorganize crypto/aes in order to separate parameters, implementations and
modes.
* Add a zero-cost abstraction over the internal representation of a block,
so that blocks can be kept in vector registers in optimized implementations.
* Add architecture-independent aesenc/aesdec/aesenclast/aesdeclast operations,
so that any AES-based primitive can be implemented, including these that don't
use the original key schedule (AES-PRF, AEGIS, MeowHash...)
* Add support for parallelization/wide blocks to take advantage of hardware
implementations.
* Align T-tables to cache lines in the software implementations to slightly
reduce side channels.
* Add an optimized implementation for modern Intel CPUs with AES-NI.
* Add new tests (AES256 key expansion).
* Reimplement the counter mode to work with any block cipher, any endianness
and to take advantage of wide blocks.
* Add benchmarks for AES.
2020-09-24 13:16:00 -04:00
Andrew Kelley
f125288c9b
Merge pull request #6336 from Rocknest/pbkdf2
Some changes to #6326 (pbkdf2)
2020-09-17 17:31:58 -04:00
Andrew Kelley
281fc10ec5 std.crypto siphash: fix assertion on the size of output buffer
the logic was backwards
2020-09-16 02:24:36 -07:00
Rocknest
c35703825f Add an error set 2020-09-16 01:58:48 +03:00
Rocknest
988fc6f9d1 flip condition 2020-09-14 02:27:09 +03:00
Rocknest
73863cf72b fix build 2020-09-13 23:59:36 +03:00