Abstract. Secure hash functions are the unsung heroes of modern cryptography. Introductory courses in cryptography often leave them out: since they don't have a secret key, it is difficult to use hash functions by themselves for cryptography. In addition, most theoretical discussions of cryptographic systems can get by without mentioning them. However, for secure practical implementations of public-key ciphers, digital signatures, and many other systems, they are indispensable. In this paper I will discuss the requirements for a secure hash function and relate my attempts to come up with a "toy" system that is both reasonably secure and suitable for students to work with by hand in a classroom setting.
Hash Functions

Hash functions are an essential part of modern cryptographic practice. At their root, however, hash functions don't necessarily have anything to do with cryptography, or even secrecy. By definition, a hash function is any function which takes an arbitrarily long string of characters or bits as input and returns a fixed-length output. A simple example using characters from the Roman alphabet is given in [2, Example 3.6.1, p. 233]: write the string in rows of five letters each (padding if necessary) and convert each letter to its numerical equivalent (A = 00, B = 01, ..., Z = 25). Then add down the columns modulo 26 and convert the result back to letters. For example, if the input string is "Hello, my name is Alice", the procedure would go:

    H E L L O  →  07 04 11 11 14
    M Y N A M  →  12 24 13 00 12
    E I S A L  →  04 08 18 00 11
    I C E X X  →  08 02 04 23 23
                  --------------
                  05 12 20 08 08
                   ↓  ↓  ↓  ↓  ↓
                   F  M  U  I  I

The output of the hash function, or hash value, is always five letters long, no matter how long the input is. In this case, "FMUII" would be the output.

When students first see hash functions, they often start with two common misconceptions. First, hash functions are not encodings of the original string: it is not possible to recover the string from the hash value, no matter how long you work at it or how clever you are. Since the hash value has a fixed length while the input can be arbitrarily long, there simply isn't enough information to recover the original input. Second, hash functions are not in any way secret. There is no key to a hash function, and anybody can compute the hash value given the input string.

Hash functions are used for purposes other than cryptography. In database design, for example, hash functions are used to convert a search key into a convenient index into a table where the record with that key is stored. A checksum is a hash function used to detect accidental errors introduced into a message or stored record.
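For readers who want to experiment, the column-sum example from earlier in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `column_hash` and its parameters `width` and `pad` are hypothetical, and the choices to drop spaces and punctuation and to pad with the letter X are inferred from the worked example rather than taken from [2].

```python
def column_hash(message: str, width: int = 5, pad: str = "X") -> str:
    """Toy column-sum hash: add letters down `width` columns modulo 26."""
    # Assumption: keep only the letters A-Z (uppercased); spaces and
    # punctuation are dropped, as in the worked example.
    letters = [c for c in message.upper() if "A" <= c <= "Z"]

    # Assumption: pad the final row with the letter X until it is full.
    while len(letters) % width != 0:
        letters.append(pad)

    # Add each letter's value (A = 0, ..., Z = 25) into its column, modulo 26.
    sums = [0] * width
    for i, c in enumerate(letters):
        column = i % width
        sums[column] = (sums[column] + ord(c) - ord("A")) % 26

    # Convert the column sums back to letters.
    return "".join(chr(s + ord("A")) for s in sums)


print(column_hash("Hello, my name is Alice"))  # FMUII
```

The output always has `width` letters, however long the input is, and the function takes no key: anyone can run it on any input.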
The hash function shown above makes a good checksum, since any single error will change the result, and two errors that canceled each other out would have to fall in the same column, that is, a multiple of five characters apart in the original message. It is rather unlikely that such an error would occur by accident! There are two properties that are convenient for hash functions in...