Transforming scalars to non-adjacent form shrinks the number of
precomputations down to 8, while still processing 4 bits at a time.
However, real-world benchmarks show that the transform is only
really useful with large precomputation tables and for batch
signature verification. So, do it for batch verification only.
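For illustration, the recoding itself can be sketched as follows: every non-zero digit is odd and bounded by 15 (4 bits of magnitude), so only the 8 odd multiples P, 3P, ..., 15P need to be precomputed, and negated digits come for free since point negation is cheap on Edwards curves. This is a self-contained sketch over a 64-bit scalar, not the stdlib code, which recodes 256-bit scalars:

```zig
const std = @import("std");

/// Width-5 NAF recoding sketch: digits are 0 or odd values in [-15, 15].
fn naf(k_in: i64, out: *[68]i64) usize {
    var k = k_in;
    var len: usize = 0;
    while (k != 0) : (len += 1) {
        var d: i64 = 0;
        if (@rem(k, 2) != 0) {
            d = @rem(k, 32); // k mod 2^5 (k never goes negative here)
            if (d >= 16) d -= 32; // center the digit: odd, in [-15, 15]
            k -= d; // the next 4 digits are now guaranteed to be zero
        }
        out[len] = d;
        k >>= 1;
    }
    return len;
}

test "naf recoding evaluates back to the scalar" {
    var digits: [68]i64 = undefined;
    const len = naf(0x7eadbeef, &digits);
    var acc: i64 = 0;
    var i: usize = len;
    while (i > 0) {
        i -= 1;
        acc = acc * 2 + digits[i]; // Horner: sum of digits[i] * 2^i
    }
    std.debug.assert(acc == 0x7eadbeef);
}
```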
https://github.com/cfrg/draft-irtf-cfrg-hash-to-curve
This is quite an important feature to have, since many other standards
currently being worked on depend on this operation.
It also brings a couple of useful arithmetic operations on field elements
along the way.
This PR also adds comments to the functions we expose in 25519/field
so that they can appear in the generated documentation.
We currently have ciphers optimized for performance, for
compatibility, for size and for specific CPUs.
However, we lack a class of ciphers that is becoming increasingly
important as Zig gets used for embedded systems, and as hardware-level
side channels keep being found in (Intel) CPUs.
Here is ISAPv2, a construction specifically designed for resilience
against leakage and fault attacks.
ISAPv2 is obviously not optimized for performance, but can be an
option for highly sensitive data, when the runtime environment cannot
be trusted.
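It plugs into the same interface as the other AEADs. A minimal round-trip sketch, assuming the `IsapA128A` name and its location under `std.crypto.aead.isap` (the exact path may differ across versions):

```zig
const std = @import("std");
const IsapA128A = std.crypto.aead.isap.IsapA128A;

test "isap roundtrip" {
    const key = [_]u8{0x01} ** IsapA128A.key_length;
    const nonce = [_]u8{0x02} ** IsapA128A.nonce_length;
    const m = "highly sensitive data";
    var c: [m.len]u8 = undefined;
    var tag: [IsapA128A.tag_length]u8 = undefined;
    IsapA128A.encrypt(&c, &tag, m, "", nonce, key);
    var m2: [m.len]u8 = undefined;
    try IsapA128A.decrypt(&m2, &c, tag, "", nonce, key);
    std.debug.assert(std.mem.eql(u8, m, &m2));
}
```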
This is a trivial implementation that just does an or/xor loop over the
two inputs.
However, this pattern is used by virtually all crypto libraries and
in practice, even without assembly barriers, LLVM never turns it into
code with conditional jumps, even if one of the parameters is constant.
This has been verified to still be the case with LLVM 11.0.0.
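The pattern itself, as a sketch (the stdlib version differs in its exact signature):

```zig
const std = @import("std");

/// The or/xor pattern: accumulate the XOR of every byte pair with OR,
/// so the comparison touches every byte and takes the same time
/// regardless of where (or whether) the inputs differ.
fn timingSafeEql(a: []const u8, b: []const u8) bool {
    std.debug.assert(a.len == b.len);
    var acc: u8 = 0;
    var i: usize = 0;
    while (i < a.len) : (i += 1) {
        acc |= a[i] ^ b[i];
    }
    return acc == 0;
}
```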
As documented in the comment right above the finalization function,
Gimli can be used as a XOF, i.e. the output doesn't have a fixed
length.
So, allow it to be used that way, just like BLAKE3.
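A hedged sketch of XOF-style usage, assuming the hash type is reachable as `std.crypto.hash.Gimli` with the post-reorg empty-options `init`, and that `final` accepts an arbitrary-length output slice:

```zig
const std = @import("std");
const Gimli = std.crypto.hash.Gimli; // assumed path for the Gimli hash

test "gimli as a xof" {
    var h = Gimli.init(.{});
    h.update("arbitrary-length input");
    var out: [64]u8 = undefined; // longer than the default 32-byte digest
    h.final(&out); // assumed to accept any output length, like Blake3
}
```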
The simple rule is that whenever we have, or will have, two similar
functions, they should live in their own namespace.
Some of these new namespaces currently contain a single function.
This is to prepare for reduced-round versions that are likely to
be added later.
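For instance, with paths illustrative of the new layout:

```zig
const crypto = @import("std").crypto;

// Similar primitives share a namespace, even when a family currently
// has a single member and is only expected to grow later.
const Sha256 = crypto.hash.sha2.Sha256; // alongside Sha224, Sha384, Sha512
const Blake2s256 = crypto.hash.blake2.Blake2s256; // with room for reduced-round variants
```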
We read and write bytes directly from the state, but in the init
function, we potentially endian-swap them.
Initialize the bytes in native endianness instead, since we will also be
reading them in native endianness later.
Also use the public interface in the "permute" test rather than an
internal interface. The state itself is not meant to be accessed directly,
even in tests.
BLAKE2 includes the expected output length in the initial state.
This expected length is distinct from the actual output length
used at finalization.
A BLAKE2b-256 output truncated to 128 bits is thus not the same as a
BLAKE2b-128 output.
This behavior can be a little bit surprising, and has been "fixed"
in BLAKE3.
In order to support this, we may want to provide an option to set the
length used for domain separation.
In Zig, there is another reason to allow this: we assume that the
output length is defined at comptime.
But BLAKE2 doesn't have a fixed output length. For an output length that
is not known at comptime, we can't simply compute the full-size digest and
truncate it, for the reason above.
What we can do now is set that length as an option to get the correct
initial state, and truncate the output if necessary.
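As a sketch, here is how a digest whose length is only known at runtime could be derived, assuming the option is called `expected_out_bits`:

```zig
const std = @import("std");
const Blake2b256 = std.crypto.hash.blake2.Blake2b256;

/// Runtime-length BLAKE2b digest sketch: use a comptime type that is at
/// least as large, set the expected output length for correct domain
/// separation, then truncate.
fn blake2bRuntimeLength(msg: []const u8, out: []u8) void {
    std.debug.assert(out.len <= Blake2b256.digest_length);
    var full: [Blake2b256.digest_length]u8 = undefined;
    Blake2b256.hash(msg, &full, .{ .expected_out_bits = out.len * 8 });
    std.mem.copy(u8, out, full[0..out.len]);
}
```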
Leverage result location semantics for X25519 like we do everywhere
else in 25519/*.
Also add the edwards25519 -> curve25519 map along the way, since many
applications seem to use it to share the same key pair for encryption
and signatures.
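A sketch of that conversion, assuming the `fromEdwards25519` name on `std.crypto.ecc.Curve25519`:

```zig
const std = @import("std");
const Edwards25519 = std.crypto.ecc.Edwards25519;
const Curve25519 = std.crypto.ecc.Curve25519;

/// Sketch: reuse an Ed25519 (signature) public key as an X25519 (DH)
/// public key, so one key pair serves both purposes.
fn edwardsPkToX25519Pk(ed_pk: [32]u8) ![32]u8 {
    const a = try Edwards25519.fromBytes(ed_pk);
    const x = try Curve25519.fromEdwards25519(a);
    return x.toBytes();
}
```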
Intel keeps changing the latency & throughput of the aes* and clmul
instructions every time they release a new model.
Adjust `optimal_parallel_blocks` accordingly, keeping 8 as a safe
default for unknown data.
Gives a ~40% speedup on x86_64.
However, the generic code remains faster on aarch64.
This is still processing only one block at a time for now.
I'm pretty confident that processing more blocks per round
will eventually give a substantial performance improvement on
all platforms with vector units.
The bcrypt function intentionally requires quite a lot of CPU cycles
to complete.
In addition to that, not having its full state constantly in the
CPU L1 cache causes a massive performance drop.
These properties slow down brute-force attacks against low-entropy
inputs (typically passwords), and GPU-based attacks get little
to no advantage over CPUs.
The NaCl constructions are available in pretty much all programming
languages, making them a solid choice for applications that require
interoperability.
Go includes them in the standard library, JavaScript has the popular
tweetnacl.js module, and reimplementations and ports of TweetNaCl
have been made everywhere.
Zig has almost everything that NaCl has at this point, the main
missing component being the Salsa20 cipher, on top of which NaCl's
secretboxes, boxes, and sealedboxes can be implemented.
So, here they are!
And clean the X25519 API up a little bit by the way.
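A round-trip sketch of the secretbox API, with names hedged to the `std.crypto.nacl` namespace:

```zig
const std = @import("std");
const SecretBox = std.crypto.nacl.SecretBox;

test "secretbox roundtrip" {
    const key = [_]u8{0x42} ** SecretBox.key_length;
    const nonce = [_]u8{0x24} ** SecretBox.nonce_length;
    const m = "interoperable message";
    var c: [SecretBox.tag_length + m.len]u8 = undefined;
    SecretBox.seal(&c, m, nonce, key);
    var m2: [m.len]u8 = undefined;
    try SecretBox.open(&m2, &c, nonce, key);
    std.debug.assert(std.mem.eql(u8, m, &m2));
}
```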
- Use `PascalCase` for all types. So, AES256GCM is now Aes256Gcm.
- Consistently use `_length` instead of mixing `_size` and `_length` for the
constants we expose.
- Use `minimum_key_length` when it represents an actual minimum length.
Otherwise, use `key_length`.
- Require output buffers (for ciphertexts, MACs, hashes) to be exactly the
expected size, instead of at least that size in some functions and the exact
size in others.
- Use a `_bits` suffix instead of `_length` when a size is represented as a
number of bits, to avoid confusion.
- Pass comptime-sized outputs as pointers to fixed-size arrays instead of
slices guarded by a runtime assertion. This is the case for most hash functions.
- Use `camelCase` for all functions instead of `snake_case`.
No functional changes, but these are breaking API changes.
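To illustrate a few of the conventions above (the type path is hedged and may differ):

```zig
const std = @import("std");
const Aes256Gcm = std.crypto.aead.Aes256Gcm; // PascalCase type; was AES256GCM

test "naming conventions" {
    // Byte counts use a _length suffix; bit counts would use _bits.
    std.debug.assert(Aes256Gcm.key_length == 32);
    std.debug.assert(Aes256Gcm.nonce_length == 12);
    std.debug.assert(Aes256Gcm.tag_length == 16);
}
```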
`DefaultCsprng` is documented as a cryptographically secure RNG.
While `ISAAC` is a CSPRNG, the variant we have, `ISAAC64`, is not, and a
64-bit seed is too small to satisfy that claim anyway.
We have also seen it used with the current date as a seed, which
defeats the point of a CSPRNG entirely.
Set `DefaultCsprng` to `Gimli` instead of `ISAAC64`, rename
the parameter from `init_s` to `secret_seed`, and add a comment to
clarify what kind of seed is expected here.
Instead of directly touching the internals of the Gimli implementation
(which can change/be architecture-specific), add an `init()` function
to the state.
Our Gimli-based CSPRNG was also not backtracking resistant. Gimli
is a permutation; it can be reverted. So, if the state was ever leaked,
not only future secrets, but also all previously generated ones, could be
recovered. Clear the rate after a squeeze in order to prevent this.
Finally, a dumb test was added just to exercise `DefaultCsprng` since
we don't use it anywhere.
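Seeding it correctly looks like this sketch (the `secret_seed_length` constant, the `std.crypto.randomBytes` helper, and the field-style `random` accessor are assumptions based on this era's APIs):

```zig
const std = @import("std");

test "seeding DefaultCsprng" {
    // The seed must be secret and unpredictable: pull it from the OS,
    // never from something like the current date.
    var seed: [std.rand.DefaultCsprng.secret_seed_length]u8 = undefined;
    try std.crypto.randomBytes(&seed);
    var csprng = std.rand.DefaultCsprng.init(seed);
    const x = csprng.random.int(u64); // cryptographically safe output
    _ = x;
}
```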
HMAC is a generic construction, so we allow it to be instantiated
with any hash function.
In practice, HMAC is almost exclusively used with MD5, SHA1 and SHA2,
so it makes sense to define some shortcuts for them.
However, defining `HmacBlake2s256` is a bit weird (and why
specifically that one, and not other hash functions we also support?).
There would be nothing wrong with that construction, but it's not
used in any standard protocol and would be a curious choice.
BLAKE2 is a keyed hash function, so it doesn't need HMAC to be used as
a MAC; that doesn't make it a good showcase for HMAC either.
This commit doesn't remove the ability to use an `Hmac(Blake2s256)` type
if, for some reason, applications really need it, but it removes
`HmacBlake2s256` as a predefined constant.
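For reference, such an instantiation stays a one-liner (paths hedged to the `std.crypto.auth.hmac` layout):

```zig
const std = @import("std");
const Hmac = std.crypto.auth.hmac.Hmac;
const Blake2s256 = std.crypto.hash.blake2.Blake2s256;

test "hmac with a custom hash function" {
    // No longer predefined, but still trivial for code that really wants it:
    const HmacBlake2s256 = Hmac(Blake2s256);
    var mac: [HmacBlake2s256.mac_length]u8 = undefined;
    HmacBlake2s256.create(&mac, "message", "key");
}
```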
This is slightly slower, but makes our verification function compatible
with batch signature verification, which, in turn, makes blockchain people
happy. And we want to make our users happy.
Add convenience functions to subtract edwards25519 points and to
clear the cofactor.
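A quick sketch of what these enable (assuming the `sub`/`clearCofactor` names on `std.crypto.ecc.Edwards25519`):

```zig
const std = @import("std");
const Edwards25519 = std.crypto.ecc.Edwards25519;

test "subtract points and clear the cofactor" {
    const b = Edwards25519.basePoint;
    const p = b.add(b).sub(b); // 2B - B == B
    const pb = p.toBytes();
    const bb = b.toBytes();
    std.debug.assert(std.mem.eql(u8, &pb, &bb));
    _ = p.clearCofactor(); // multiplies by 8, into the prime-order subgroup
}
```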
Brings a 30% speed boost on x86_64 even though we still process only
one block at a time for now.
Only enabled on x86_64 since the non-vectorized implementation seems
to currently perform better on some architectures (at least on aarch64).
But the non-vectorized implementation still gets a little speed boost
as well (~17%) with these changes.
Performance increases from ~400 MiB/s to ~450 MiB/s at the expense of
extra code. Thus, aggregation is disabled on ReleaseSmall.
Since the multiplication cost is significant compared to the reduction,
aggregating more than 2 blocks is probably not worth it.
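For the record, two-block aggregation is just Horner's rule with a
precomputed squared key: instead of two multiply-then-reduce steps,
acc = reduce((acc + m1) * k) followed by acc = reduce((acc + m2) * k),
it computes acc = reduce((acc + m1) * k^2 + m2 * k) with a single
reduction. The per-block multiplications remain; only reductions are
amortized, which is why going past 2 blocks quickly stops paying off.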