According to the ASN.1 BER spec, we should be encoding
all sequences (including empty ones) as constructed:
8.9.1 The encoding of a sequence value shall be constructed.
8.10.1 The encoding of a sequence-of value shall be constructed.
8.11.1 The encoding of a set value shall be constructed.
8.12.1 The encoding of a set-of value shall be constructed.
However, we were only setting them as constructed when the
list was non-empty.
This changes it, and makes letsencrypt happy with the CSRs that
we generate.
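As a rough illustration of the rule quoted above, the following sketch writes the identifier and length octets for a SEQUENCE or SET and sets the constructed bit unconditionally, so an empty list comes out as 30 00 rather than 10 00. The helper name ber_list_header() and the enum names are hypothetical and are not the libsec encoder.

```c
#include <stddef.h>

enum {
	BerConstructed = 0x20,	/* bit 6 of the identifier octet */
	BerSequence    = 0x10,	/* universal tag number 16 */
	BerSet         = 0x11	/* universal tag number 17 */
};

/*
 * Hypothetical helper: write the identifier and length octets for a
 * SEQUENCE or SET whose contents are len bytes long.
 * Returns the number of header bytes written.
 */
size_t
ber_list_header(unsigned char *out, int tag, size_t len)
{
	size_t n = 0, i, nlen = 0;

	/* constructed bit is set even when len == 0 */
	out[n++] = (unsigned char)(tag | BerConstructed);
	if(len < 0x80){
		out[n++] = (unsigned char)len;	/* short form length */
		return n;
	}
	/* long form: 0x80|count, then the length in big-endian order */
	for(i = len; i != 0; i >>= 8)
		nlen++;
	out[n++] = (unsigned char)(0x80 | nlen);
	for(i = nlen; i > 0; i--)
		out[n++] = (unsigned char)(len >> (8*(i-1)));
	return n;
}
```

The caller is assumed to provide a buffer large enough for the header (2 bytes for short lengths, up to 2+sizeof(size_t) otherwise).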
We need a way to parse an RSA certificate request and return the public
key and subject names. The new function X509reqtoRSApub() works the
same way as X509toRSApub() but on a certificate request.
We also need to support certificates that are valid for multiple domain
names (as tlshand does not support certificate selection). For this
reason, a comma separated list is returned as the certificate subject,
making it symmetric to X509rsareq() handling.
A little helper is provided with this change (auth/x5092pub) that takes
a certificate (or a certificate request when -r flag is provided) and
outputs the RSA public key in Plan 9 format appended with the subject
attribute.
when trying to request certificates from letsencrypt,
their test api would reject our csr because of
"truncated sequence" unless we force subjectAltName
by passing multiple domains (as comma separated list).
apparently, we need to provide the context specific tag
"cont [ 0 ]" for the extensions even when we do not have
any extensions for the csr (extensions are only triggered
when we need to have subjectAltNames).
for this, we change mkcont() to take a Elist* instead,
which then can be nil when not used. also put the tag
number argument first, which makes it easier to read.
As checking for all zero has to be done in a timing-safe
way to avoid a side channel, it is best to do this here
instead of letting the caller deal with it.
This adds a return type of int to curve25519_dh_finish()
where returning 0 means we got an all-zero shared key.
RFC7748 states:
The check for the all-zero value results from the fact
that the X25519 function produces that value if it
operates on an input corresponding to a point with small
order, where the order divides the cofactor of the curve.
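A minimal sketch of what a timing-safe all-zero check can look like: every byte is read and ORed into an accumulator, and the result is derived without a data-dependent branch. The function name ct_nonzero() is hypothetical; this is not the code inside curve25519_dh_finish(), only an illustration of the technique, and a sufficiently aggressive compiler could in principle still transform it.

```c
#include <stddef.h>

/*
 * Returns 1 if any of the n bytes at p is non-zero and 0 if all
 * are zero, mirroring the convention that 0 means "all-zero key".
 * The loop always visits every byte.
 */
int
ct_nonzero(const unsigned char *p, size_t n)
{
	unsigned int acc = 0;
	size_t i;

	for(i = 0; i < n; i++)
		acc |= p[i];
	/* acc is in 0..255, so (acc + 0xff) >> 8 is 1 exactly when acc != 0 */
	return (int)((acc + 0xffu) >> 8);
}
```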
1. add the curve x25519 to tls, both client and server.
it's faster, immune to timing attacks by design,
does not require verifying if the public key is valid,
etc etc. server-side has to check if the client supports
the curve, so a new function has been introduced to parse
the client's extensions.
2. reject weak dhe primes that can be easily cracked with
the number field sieve algorithm. this avoids attacks like
logjam.
3. stop putting unix time to the first 4 bytes of client/
server random. it can allow fingerprinting, tls 1.3 doesn't
recommend it any more and there was a draft to deprecate
this behaviour earlier.[1]
4. simplify prf code, remove useless cipher enums.
[1] https://datatracker.ietf.org/doc/html/draft-mathewson-no-gmtunixtime-00
kvik writes:
I needed to convert the RSA private key that was laying around in
secstore into a format understood by UNIX® tools like SSH.
With asn12rsa(8) we can go from the ASN.1/DER to Plan 9 format, but not
back - so I wrote the libsec function asn1encodeRSApriv(2) and used it in
rsa2asn1(8) by adding the -a flag which causes the full private key to be
encoded and output.
Instead of only using a hash over the whole certificate for
white/black-listing, now we can also use a hash over the
Subject Public Key Info (SPKI) field of the certificate which
contains the public key algorithm and the public key itself.
This allows certificates to be renewed independently of the
public key.
X509dump() now prints the public key thumbprint in addition
to the certificate thumbprint.
tlsclient will print the certificate when run with -D flag.
okCertificate() will print the public key thumbprint in its
error string when no match has been found.
getting rid of some functions that take Byte* and instead
pass uchar* and length.
keeping the signature and public key fields in CertX509
as Bits* allows ownership transfer by swapping pointers.
use common code to copy CN from subject field.
- unroll the loops
- rotate the taps on each step, avoiding copies
- simplify boolean formulas for Ch() and Maj()
this yields around 40% throughput increase on 32/64bit
archs for sha2_256 and sha2_512 on amd64.
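The message does not say which simplified formulas were chosen. One common pair of reductions, shown here next to the FIPS 180-4 textbook definitions, saves an operation in each function. The names ch_ref, maj_ref, ch_fast and maj_fast are made up for this sketch.

```c
#include <stdint.h>

/* textbook definitions from FIPS 180-4 */
uint32_t
ch_ref(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (~x & z);
}

uint32_t
maj_ref(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) ^ (x & z) ^ (y & z);
}

/* where x is 1 pick y, else z: z ^ (x & (y ^ z)) needs no complement */
uint32_t
ch_fast(uint32_t x, uint32_t y, uint32_t z)
{
	return z ^ (x & (y ^ z));
}

/* majority: both x and y set, or z set together with either of them */
uint32_t
maj_fast(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (z & (x | y));
}
```

Feeding the constants 0xF0F0F0F0, 0xCCCCCCCC and 0xAAAAAAAA through both versions exercises all eight input bit combinations at once.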
- get rid of the temporary copies and memmoves()
- when the data pointer is aligned, do xor and copying inline
speedup for auth/aescbc encryption depends on arch:
- zynq 7% (arm)
- t23 13% (386)
- x230 20% (amd64, aes-ni)
- apu2 25% (amd64, aes-ni)
doing 4 quarterrounds in parallel using 128-bit
vector registers. for second round shuffle the columns and
then shuffle back.
code is rather obvious. only trick here is for the first
quarterround PSHUFLW/PSHUFHW is used to swap the halfwords
for the <<<16 rotation.
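For reference, a scalar sketch of one quarterround, assuming this message is about the ChaCha quarter round as specified in RFC 8439. The vector code performs four of these at once, one per 32-bit lane. Note that a left rotation by 16 is the same as swapping the two 16-bit halves of a word, which is why a halfword shuffle can stand in for it. The function names are hypothetical.

```c
#include <stdint.h>

/* rotate a 32-bit word left by n bits, 0 < n < 32 */
uint32_t
rotl32(uint32_t x, int n)
{
	return (x << n) | (x >> (32 - n));
}

/* one ChaCha quarter round on four state words (RFC 8439, section 2.1) */
void
quarterround(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);	/* halfword swap */
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

With the inputs from the RFC 8439 section 2.1.1 test vector (11111111, 01020304, 9b8d6f43, 01234567) this produces ea2a92f4, cb1cf8ce, 4581472e, 5881c4bb.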
Add assembler versions for aes_encrypt/aes_decrypt and the key
setup using AES-NI instruction set. This makes aes_encrypt and
aes_decrypt into function pointers which get initialized by
the first call to setupAESstate().
Note that the expanded round key words are *NOT* stored in big
endian order as with the portable implementation. For that reason
the AESstate.ekey and AESstate.dkey fields have been changed to
void* forcing an error when someone is accessing the roundkey
words. One offender was aesXCBmac, which doesn't appear to be
used and the code looks horrible so it has been deleted.
The AES-NI implementation is for amd64 only as it requires the
kernel to save/restore the FPU state across syscalls and
pagefaults.
the previous implementation was not portable at all, assuming
little endian in gf_mulx() and that one can cast unaligned
pointers to ulong in xor128(). also the error code is likely
to be ignored, so better abort() when the length is not a
multiple of the AES block size.
we also pass in full AESstate structures now instead of
the expanded key longs, so that we do not need to hardcode
the number of rounds. this allows each individual key to
be bigger than 128 bit.
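A sketch of how the two helpers can be written without endianness or alignment assumptions by working on bytes only. It assumes the XTS convention (IEEE 1619), where the 128-bit tweak is treated as a little-endian byte string and reduced with the constant 0x87. The names gf_mulx_bytes() and xor128_bytes() are hypothetical, not the libsec functions.

```c
#include <stddef.h>

/* multiply the 128-bit value x by the polynomial "x" in GF(2^128) */
void
gf_mulx_bytes(unsigned char x[16])
{
	unsigned int carry = 0, t;
	size_t i;

	for(i = 0; i < 16; i++){
		t = x[i] >> 7;			/* bit shifted out of this byte */
		x[i] = (unsigned char)((x[i] << 1) | carry);
		carry = t;
	}
	if(carry)
		x[0] ^= 0x87;			/* reduce modulo x^128+x^7+x^2+x+1 */
}

/* dst = a ^ b over 16 bytes; works for any alignment, dst may alias a or b */
void
xor128_bytes(unsigned char *dst, const unsigned char *a, const unsigned char *b)
{
	size_t i;

	for(i = 0; i < 16; i++)
		dst[i] = a[i] ^ b[i];
}
```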
initThumbprints() now takes an application tag argument
so x509 and ssh can coexist.
the thumbprint entries can now hold both sha1 and sha256
hashes. okThumbprint() now takes a len argument for the
hash length used.
the new function okCertificate() hashes the certificate
with both and checks for any matches.
on failure, okCertificate() returns 0 and sets error string.
we also check for include loops now in thumbfiles, limiting
the number of includes to 8.
when converting mpint to bytes, always pad it to the size of
the modulus (RSA,DHE,ECDHE). mptobytes() now takes a byte len
parameter which the caller usually calculates from the group
modulus using mpsignif(). this bug sometimes caused "bad record mac"
after the handshake.
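To illustrate the padding issue with a stand-in for mpint: the sketch below writes a 64-bit value as a fixed-length big-endian byte string, keeping leading zero bytes instead of dropping them. The function u64tobe_padded() is hypothetical and is not the mptobytes() used in tlshand.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store v big-endian into exactly len bytes, left-padded with zeros.
 * Returns len on success, -1 if v does not fit into len bytes.
 */
int
u64tobe_padded(uint64_t v, unsigned char *buf, size_t len)
{
	size_t i;

	for(i = len; i > 0; i--){
		buf[i-1] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
	return v == 0 ? (int)len : -1;
}
```

A value such as 0x0102 stored with len 4 yields 00 00 01 02. Without the padding, both sides would hash byte strings of different lengths whenever the top byte of the shared secret happened to be zero, which is consistent with the intermittent failure described above.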
use a shared buffer, given that msgSend()/msgRecv() don't overlap
we can use the first half for sending, and the top half for
receiving, shifting down as necessary. the space between sendp and
recvp is free.
explicitly check for overflow in msgSend().
reverting asn1mpint() as all users really just expect
unsigned integers here. also openssl seems to interpret
rsa modulus as unsigned no matter what... so keeping
it as it was before.
handle nil cipher bytes in factotum_rsa_decrypt() due
to pkcs1padbuf() failing.
apply some lessons from Intel's BERserk paper:
instead of parsing the decrypted digest info blob, we
generate the *expected* blobs for all digest algorithms
that match the digest size and compare the results.
provide pkcs1 pad and unpad functions that consistently
enforce minimum padding size and handle block types 1
and 2.
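The compare-instead-of-parse idea can be sketched as follows for a single digest algorithm (SHA-256). The expected block 00 01 FF..FF 00 || DigestInfo is built with at least 8 padding bytes and compared byte for byte against the decrypted signature. All function names are hypothetical and the real code handles several digest algorithms and block type 2 as well.

```c
#include <stddef.h>
#include <string.h>

enum { MinPad = 8 };	/* minimum number of padding bytes */

/* DER DigestInfo prefix for SHA-256 (RFC 8017, section 9.2) */
static const unsigned char sha256prefix[] = {
	0x30,0x31,0x30,0x0d,0x06,0x09,0x60,0x86,0x48,0x01,
	0x65,0x03,0x04,0x02,0x01,0x05,0x00,0x04,0x20
};

/* build 00 01 FF..FF 00 || data in em[0..emlen-1]; -1 if padding would be too short */
int
pad_type1(unsigned char *em, size_t emlen, const unsigned char *data, size_t dlen)
{
	size_t padlen;

	if(emlen < dlen + 3 + MinPad)
		return -1;
	padlen = emlen - dlen - 3;
	em[0] = 0x00;
	em[1] = 0x01;
	memset(em+2, 0xff, padlen);
	em[2+padlen] = 0x00;
	memcpy(em+3+padlen, data, dlen);
	return 0;
}

/* generate the block we expect for a given SHA-256 digest */
int
expected_sha256_block(unsigned char *em, size_t emlen, const unsigned char digest[32])
{
	unsigned char info[sizeof sha256prefix + 32];

	memcpy(info, sha256prefix, sizeof sha256prefix);
	memcpy(info + sizeof sha256prefix, digest, 32);
	return pad_type1(em, emlen, info, sizeof info);
}

/* returns 1 if em equals the expected block, 0 otherwise; never parses em */
int
verify_sha256_block(const unsigned char *em, size_t emlen, const unsigned char digest[32])
{
	unsigned char want[512], diff = 0;
	size_t i;

	if(emlen > sizeof want || expected_sha256_block(want, emlen, digest) < 0)
		return 0;
	for(i = 0; i < emlen; i++)
		diff |= (unsigned char)(em[i] ^ want[i]);
	return diff == 0;
}
```

Because nothing in the attacker-supplied block is interpreted, shortened padding or trailing garbage simply fails the comparison.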
quick fix is to bias the rounding so the msb will always
be zero. should write proper conversion code to actually
deal with signed mpints... also for asn1mpint()... -- cinap
given that we only pass uchar* with constant offsets
to the s and d arguments of ENCRYPT(), we do not need
the temporary variables sp/dp and the compiler is
smart enough to combine the const offset with the ones
from GET4() and PUT4() and emit single load and store
instructions for the byte accesses.
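A small sketch of the addressing pattern, with assumed big-endian definitions of GET4() and PUT4() (the real macros may differ) and a hypothetical xorkey16() in place of the ENCRYPT() macro. The source and destination pointers are never advanced; each access is base plus a constant, which the compiler can fold into the load or store.

```c
#include <stdint.h>

/* assumed definitions: big-endian 32-bit load and store on byte pointers */
#define GET4(p)	((uint32_t)(p)[0]<<24 | (uint32_t)(p)[1]<<16 | \
		 (uint32_t)(p)[2]<<8 | (uint32_t)(p)[3])
#define PUT4(p, v)	((p)[0] = (unsigned char)((v)>>24), \
			 (p)[1] = (unsigned char)((v)>>16), \
			 (p)[2] = (unsigned char)((v)>>8), \
			 (p)[3] = (unsigned char)(v))

/* xor one 16-byte block with four key words using constant offsets only */
void
xorkey16(unsigned char *d, const unsigned char *s, const uint32_t k[4])
{
	/* all four words are loaded before any store, so d may equal s */
	uint32_t w0 = GET4(s+0) ^ k[0];
	uint32_t w1 = GET4(s+4) ^ k[1];
	uint32_t w2 = GET4(s+8) ^ k[2];
	uint32_t w3 = GET4(s+12) ^ k[3];

	PUT4(d+0, w0);
	PUT4(d+4, w1);
	PUT4(d+8, w2);
	PUT4(d+12, w3);
}
```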