Cryptographic hash functions are used to avoid storing or handling passwords in plain text. They are also used to verify data integrity, e.g., of downloaded files (software). In digital signatures, a hash function generates a message digest, which is then signed using the sender’s private key.
A hash function converts input of any length to a fixed-length output, called a message digest.
Requirements of a hash function:
- input can be any length
- output length is fixed
- relatively easy to compute
- one way (hash functions are also called one-way functions)
- collision-resistant (often called collision-free) – it is very difficult to find two strings that generate the same hash value
Although it is not a formal requirement, hash functions should also be fast to compute.
Some of the popular hash functions are MD5, SHA1 and SHA2.
Use Ruby to generate a hash. In IRB:
require 'digest/sha2'
h = Digest::SHA2.new << 'string'
=> #<Digest::SHA2:256 473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8>
p h.class
=> Digest::SHA2
SHA256, a common hash function, generates a hash which is 256 bits long. Each hex digit encodes 4 bits, so you need 64 hex characters to store the digest.
p h.to_s.length
=> 64
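The same arithmetic can be checked for other hash functions. The following is a minimal sketch using the Digest classes from Ruby's standard library; the constant name DIGESTS is only illustrative.

```ruby
require 'digest'

# Hex digest of 'string' under several hash functions from the standard library.
DIGESTS = {
  'MD5'     => Digest::MD5.hexdigest('string'),
  'SHA1'    => Digest::SHA1.hexdigest('string'),
  'SHA-256' => Digest::SHA256.hexdigest('string'),
  'SHA-384' => Digest::SHA384.hexdigest('string'),
  'SHA-512' => Digest::SHA512.hexdigest('string')
}

# Each hex character carries 4 bits, so bits = hex length * 4.
DIGESTS.each do |name, hex|
  puts "#{name}: #{hex.length} hex chars, #{hex.length * 4} bits"
end
```

The digest length depends only on the hash function, never on the input.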
Cross check with an online calculator at http://www.hashgenerator.de/
Check the hash for other values such as ‘string1’, ‘grinst’, ‘string2’
Can you modify the string to generate a given hash? Can you ‘guess’ the hash?
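One way to explore these questions is to print the digests side by side. This is only a sketch for experimenting; the helper names sha256_hex and differing_positions are made up for this example.

```ruby
require 'digest'

# Inputs suggested in the text; 'grinst' is an anagram of 'string'.
INPUTS = %w[string string1 grinst string2]

def sha256_hex(text)
  Digest::SHA256.hexdigest(text)
end

INPUTS.each { |text| puts "#{text.ljust(8)} #{sha256_hex(text)}" }

# Count how many of the 64 hex positions differ between two digests.
def differing_positions(a, b)
  a.chars.zip(b.chars).count { |x, y| x != y }
end

puts differing_positions(sha256_hex('string'), sha256_hex('string1'))
```

Even a one-character change to the input should give a digest that looks unrelated to the original, which is what makes guessing impractical.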
Can you crack a hash?
When the input to a hash function is restricted, you can compute the hash for all the input values in advance and compare a given hash with the computed values. E.g., hash a single octet of an IP address using MD5: there are only 256 possible values, and the hash of each can be computed in advance. (Introduction to Network Security – Neal Krawetz)
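The precomputation described above can be sketched as follows, assuming the restricted input is a single value in the range 0 to 255 that is hashed as decimal text. The encoding and the names LOOKUP and crack_octet are assumptions made for this example, not taken from the book.

```ruby
require 'digest'

# Precompute the MD5 digest of every value 0..255 (hashed as decimal text,
# which is an assumption about how the value was encoded before hashing).
LOOKUP = (0..255).each_with_object({}) do |value, table|
  table[Digest::MD5.hexdigest(value.to_s)] = value
end

# Recover the original value from its digest by table lookup; nil if unknown.
def crack_octet(md5_hex)
  LOOKUP[md5_hex]
end

secret = Digest::MD5.hexdigest('192')
puts crack_octet(secret)   # prints 192
```

The hash function is never inverted here. The input space is simply small enough to enumerate.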
Using a ‘salt’
One way to crack hashes is to compute the hashes of common words and see if any of them match the stored hash (dictionary attack). To make that harder you can use a ‘salt’. A salt is a random string, ideally different for each stored value, which is prepended to the string before hashing and stored alongside the hash. Precomputed hashes of common words then no longer match, and the attacker has to redo the work for every salt.
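A minimal sketch of both ideas, assuming SHA256 and a 16-byte random salt. The word list, the constant names and the salted_hash helper are hypothetical and only serve the illustration.

```ruby
require 'digest'
require 'securerandom'

# Hypothetical word list standing in for a real dictionary.
WORDS = %w[password letmein dragon monkey secret]

# Attacker's precomputed table: digest of each word, without any salt.
PRECOMPUTED = WORDS.map { |word| [Digest::SHA256.hexdigest(word), word] }.to_h

# Hash a password with the salt prepended.
def salted_hash(password, salt)
  Digest::SHA256.hexdigest(salt + password)
end

unsalted = Digest::SHA256.hexdigest('dragon')
salt     = SecureRandom.hex(16)            # 16 random bytes as 32 hex chars
salted   = salted_hash('dragon', salt)

puts PRECOMPUTED[unsalted].inspect   # "dragon": found in the table
puts PRECOMPUTED[salted].inspect     # nil: the table no longer matches
```

An attacker who knows the salt can still rerun the dictionary against it, so a salt slows down bulk cracking rather than protecting a weak password.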
Which hash function should you use?
Weaknesses have been found in MD5 and SHA1. At this point (May 2012) you should make sure you are using SHA2.
Homework: How is encryption different from hashing? When would you use encryption instead of hashing?
Testing hash functions
If your application (web or desktop) uses passwords or other sensitive information such as credit card numbers, you should check how the data is stored. Start by tracing the flow of the relevant data through the entire application (data tour). If you have been doing blackbox testing, you may not have access to the source and might face resistance when discussing the flow of data through the application. To make an impact on security testing you should be able to discuss the flow of data through an application.
If you are creating a security product which uses a hash function, e.g., allows administrators to use a hash function to create message digests, you may want to pay more attention to the correctness of the hash.
You should spend some time cross checking values with an online calculator.
You should trace how the original input value is stored and when the hash is computed.
When the user enters the value again, how is that compared with the stored value?
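The flow you are tracing often looks something like the sketch below. It is not how any particular application works: the in-memory USERS hash stands in for a database table, and the method names are hypothetical.

```ruby
require 'digest'
require 'securerandom'

# Hypothetical in-memory stand-in for the application's user table.
USERS = {}

# On registration: store only the salt and the digest, never the password.
def register(username, password)
  salt = SecureRandom.hex(16)
  USERS[username] = { salt: salt, digest: Digest::SHA256.hexdigest(salt + password) }
end

# On login: recompute the digest with the stored salt and compare digests.
def valid_login?(username, password)
  record = USERS[username]
  return false if record.nil?
  Digest::SHA256.hexdigest(record[:salt] + password) == record[:digest]
end

register('alice', 's3cret')
puts valid_login?('alice', 's3cret')   # true
puts valid_login?('alice', 'wrong')    # false
```

When testing, check that the plain-text value does not survive anywhere along this path, for example in logs or temporary tables.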
When working with security you often encounter complex algorithms. Don’t be intimidated by the complexity – as a tester you need to focus on what is important. Validating the algorithm itself is less important, although you should not ignore it: you are responsible for validating the results. It is more important to examine what type of data is stored in the application and how it is stored.
Note that reporting hash values which are incorrect or of the wrong length will leave you and the organization looking very stupid (tester beware).
Are you using the correct standard?
When working with security algorithms, most teams work with standard algorithms. However, these algorithms are always undergoing scrutiny and refinement. You should check that the algorithms you are using don’t have any published flaws. Note that these algorithms are complex and you don’t need to understand the actual weakness. You can just search for opinions on the algorithm and any concerns. Should you be using the newer algorithm which was recently published?
Are you vulnerable if you don’t use the most recent algorithm?
It is unlikely that you will be vulnerable in the case of an attack if you use MD5 or SHA1 (along with a salt and other precautions). However, if you use SHA2, no one will find fault with you.
Use of third party libraries
When working with security algorithms you should check how the algorithm is implemented. Is it part of the standard library provided with the compiler, or are you using a third party component? In both cases, is the implementation certified?
When hash functions are used to protect passwords or other user information, you should check how the data is stored in and retrieved from a database. Are there performance implications? You should be able to create a very large number of items to check performance.
In general when working with hash functions, you should plan to create a large amount of data and check it against an oracle, such as an online calculator or a simple function in Ruby or Python.
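A sketch of such an oracle check in Ruby. It assumes the function under test is SHA256 and uses OpenSSL's implementation as the oracle, plus two well-known SHA-256 test vectors (the empty string and 'abc'). The method names are hypothetical, and hash_under_test is a placeholder for your application's own call.

```ruby
require 'digest'
require 'openssl'

# Well-known SHA-256 test vectors (input => expected hex digest).
KNOWN_VECTORS = {
  ''    => 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855',
  'abc' => 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad'
}

# Function under test; swap in the application's own hashing call here.
def hash_under_test(text)
  Digest::SHA256.hexdigest(text)
end

# Oracle: a second implementation, here OpenSSL's SHA-256.
def oracle(text)
  OpenSSL::Digest.new('SHA256').hexdigest(text)
end

# Return every input where the function under test disagrees with the oracle.
def mismatches(inputs)
  inputs.reject { |text| hash_under_test(text) == oracle(text) }
end

generated = (1..10_000).map { |i| "input-#{i}" }
puts mismatches(KNOWN_VECTORS.keys + generated).length   # 0 when both agree
KNOWN_VECTORS.each { |text, hex| puts hash_under_test(text) == hex }
```

An oracle is only useful if it is independent of the code under test, so check that both sides do not end up calling the same library.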
You should have access to password lists and incorporate that into your testing.
These are some terms you should know when working with hash functions:
- Parity check – used to detect errors in memory or communication. Count the 1 bits in the data and compare the result (odd or even) with a parity bit.
- CRC – cyclic redundancy checks are used to detect accidental changes to input, such as network data. The term is sometimes used loosely as a synonym for hash, although a CRC is not a cryptographic hash
- checksums – generate fixed-length data from input. They are generally simpler than hash functions. A hash value is also sometimes called a checksum
- Non-repudiation – digital signatures are created using the sender’s private key and can be used to establish ownership. Message authentication codes use a shared secret key and do not have the property of non-repudiation.
- rainbow tables – precomputed chains of hashes and candidate passwords of which only the start and end points are stored, saving on storage compared to a full table of precomputed hashes
- dictionary attack – tries all the words in a list called a dictionary
- brute force – tries every possible combination of characters
- MD – Message Digest algorithm
- SHA – Secure hash algorithm
- one way encryption
- data integrity
- message digest
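The two simplest terms in this list, parity and CRC, can be tried out directly. This is a sketch assuming even parity over a single byte, with CRC-32 provided by Ruby's zlib binding; the helper names are made up for the example.

```ruby
require 'zlib'

# Even parity bit: 1 when the number of 1 bits in the byte is odd, so that
# the data bits plus the parity bit always contain an even number of 1s.
def even_parity_bit(byte)
  byte.to_s(2).count('1') % 2
end

# Receiver side: recompute the parity bit and compare with the one received.
def parity_ok?(byte, parity_bit)
  even_parity_bit(byte) == parity_bit
end

byte   = 0b1011_0010                  # four 1 bits
parity = even_parity_bit(byte)        # 0
puts parity_ok?(byte, parity)                 # true
puts parity_ok?(byte ^ 0b0000_0100, parity)   # false: one bit flipped in transit

# CRC-32 of the conventional check string '123456789'.
puts format('%08x', Zlib.crc32('123456789'))  # cbf43926
```

A single parity bit misses any error that flips an even number of bits. Neither parity nor CRC is designed to resist deliberate tampering, which is the job of a cryptographic hash.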
Spend time playing around with creating hashes in Ruby. You should know the length of the digest for different hash functions. The only way to build confidence is to work with these functions instead of just reading about them.