## Sunday, July 16, 2006

.

### Digital signatures, part 3

This is a continuation of the Digital Signatures overview series.

#### 2.2  Public-Key Encryption

In the mid-1970s, Whitfield Diffie and Martin Hellman refined and published a practical method for generating and using asymmetric keys for encryption. An alternative method, known as RSA was published (and patented) a couple of years later by a team led by Ron Rivest, and it is this method, now that the patent has expired, which is in common use today.

The theory is based on pairs of keys generated from large prime numbers, along with an algorithm that uses one key for encryption and the other for decryption. And, in fact, it works both ways, such that the following two formulas are both true:

D(k2, E(k1, tp)) → tp
D(k1, E(k2, tp)) → tp
We keep one of the keys secret (this is important!) and call it the private key, and we freely and widely publish the other key and call it the public key.

If I want to send you a message that only you can read, I use your public key to encrypt it. Anyone can do that, because your public key is freely available, but only you can decrypt it because only you have your private key.

Similarly, if I want to send you a message that you can be sure I sent, I can encrypt it with my private key, which only I have access to. Anyone can read this message, because anyone can get my public key and decrypt it — so it isn't a secret message — but anyone who reads it knows for sure that I was the one who encrypted it (assuming that my private key has been kept secure). This is the part we use to create digital signatures.

What this gives us in terms of key management is significant. Key distribution can be done publicly, and every user need have only one key pair (or one pair per role; I might want one pair that I use for business and another for personal use, for example). Ten people can communicate securely with each other in any combination using only ten key pairs, rather than 1013 shared symmetric keys.

As it turns out, these asymmetric-key algorithms are much slower and more costly to implement than are the symmetric-key algorithms, in part because much longer keys (much larger numbers) are required. So we do not actually encrypt the entire message this way. To send an encrypted message, we create a symmetric key and encrypt the message with that, and then we encrypt that key using the public-key algorithm and send it along with the cipher-text message. The recipient first decrypts the key, then uses that key to decrypt the message.

When we create a digital signature, rather than encrypting the entire message to prove that we sent it, we create a hash (also called a digest) of the message, and we encrypt that; the message is sent in plain text and can be read, but the recipient can decrypt the hash and use it to verify that the message has not been tampered with.

Of course, if we need privacy and a signature on the same message, we can both digitally sign and encrypt the message at the same time.

#### 2.3  Hash Functions

The purpose of a hash function is to create a small chunk of data (the hash value, or just hash) that represents a longer message. The important properties of the hash function are that the hash value must be relatively small and that any change to the message, however minor, will almost certainly result in a different hash value.

Consider the following algorithm: think of each character of the message as a number from 0 to 255, and add the values of each character together, then divide by 9 and use the remainder as the hash value. The hash value is a number from 0 to 8, so it's very small. And any change at all in the original message will have an 8-in-9 chance (89%) of changing the hash value. If we used such a hash function, we would know with 89% certainty whether the message had been tampered with. Of course, that means that there's an 11% chance that it could be altered undetectably.

Real hash functions — current ones are MD5, SHA-1, and SHA-256 — create hash values that are longer than those few bits in our trivial example — 128 bits, 160 bits, and 256 bits, respectively — and use algorithms that are proven to be effective using the mathematics and the computer processing power that was current at the time. Discussion of hash attacks is beyond our scope, but practical weaknesses have been demonstrated against MD5, and theoretical ones against SHA-1, so things will be changing in this area over the next few years. That said, these hash functions are still useful for generating digital signatures today.

Next time: Digital Signatures