Analyze your hash functions: The Avalanche Metrics Calculation

Arish Sateesan
5 min readJul 6, 2020

--

The most desirable property of a block cipher or a cryptographic hash function is the Avalanche effect. Avalanche Criterion and Strict Avalanche Criterion are the most commonly used metrics to analyze a hash function. There are other metrics such as Bit Independence and Entropy to test the randomness of a hash function. But how do you perform these tests to check if a hash function satisfies the desirable properties?

The Avalanche criterion and strict avalanche criterion are well known. We can define these criteria as:
Avalanche: An average of one-half of the output bits should change whenever a single input bit is complemented.
Strict Avalanche: Each bit should have a 50% probability to change for every single bit change of the input.
We have seen the equations for determining the avalanche co-efficient and it might have seemed very difficult. Well, it isn’t! It is fairly easy to calculate the avalanche co-efficient.

Calculation of Avalanche Co-efficient:

Steps to Calculate Avalanche co-efficient:
Let F(x) be the hash function.
1. Generate a random number A1 having a size equal to the number of input bits.
2. Calculate the hash value H1=F(A1).
3. Toggle any bit of A1 randomly to generate A2.
4. Calculate the hash value H2=F(A2).
5. Compute X=H1 xor H2.
6. Calculate the number of set bits N in X.
7. Avalanche co-efficient K=N/n, where n is the number of bits in the output.
8. Repeat steps 1 to 7 for M number of times and calculate the mean of all K. (Take the value of M sufficiently large, say 250,000).

Calculation of Avalanche Metrics:

The strict avalanche criterion can be calculated in a similar way by considering the probability of each bit rather than the average of all bits. Let’s follow a different way to calculate the same.
Let’s introduce an Avalanche Probability Vector P. The avalanche probability vector gives the probabilities of each bit change at the output.
Steps to calculate Probability Vector P:
The first steps to calculate X are similar to the steps explained above. Once we have calculated X, follow the pseudo-code below to calculate the count vector COUNT and Probability vector P. To ensure proper results, make sure that the value of M is large enough(Eg: 250,000).

Where n is the number of bits in the output.

P provides the probabilities of each bit change at the output for a single bit change at the input (which in fact is the strict avalanche vector). From the Avalanche Probability Vector P, we can obtain three avalanche metrics: Avalanche dependence, avalanche weight, and avalanche entropy.

Avalanche Dependence: It is the number of output bits that may flip for a single-bit change at the input. It is defined as:

The pseudo-code to calculate the bit Dependence is:

Avalanche Weight: It is the expected Hamming weight of the output difference. It is defined as:

The Pseudo-code to calculate Avalanche Weight is:

Avalanche Entropy: It is the uncertainty about whether output bits flip or not. It is defined as entropy:

The Pseudo-code to calculate Avalanche Entropy is:

These three avalanche metrics shows whether the function meets the desired avalanche properties.

Always take the worst-case scenario:

To make sure that the hash function meets the avalanche criteria, it is recommended to consider the worst-case scenario. In the above steps, we calculated the mean of a single bit change for M iterations. To calculate the worst-case scenario:

  • M iterations are performed for each of the k bits. ( k is the number of bits in the input). ie; First bit of all the random inputs are changed for M iterations. All 3 avalanche metrics are calculated.
  • Compare the calculated avalanche metrics and check if it is lower than the existing metrics. (All the metric values are initialized to maximum in the beginning). The minimum value is kept.
  • Then the second bit of all the random inputs is changed for another M iterations. And the process of calculating the avalanche metrics and taking the minimum is repeated. ( So, a total of k*M iterations).

The pseudocode is shown below.

Where Metric stores the overall avalanche metrics. metric_temp stores the avalanche metrics for M iterations of a particular bit change.

While calculating the worst-case scenario, a small change has to be applied while calculating the probability vector. In the earlier section, A2 was calculated by changing any random bit, but now the Tth bit has to be changed as shown in the pseudocode below.

Reference: J. Daemen, S. Hoffert, G. Van Assche, and R. Van Keer. The design of Xoodoo and Xoofff. IACR Transactions on Symmetric Cryptology, pages 1–38, 2018.

A. Sateesan, J. Vliegen, J. Daemen and N. Mentens, “Novel Bloom filter algorithms and architectures for ultra-high-speed network security applications,” 2020 23rd Euromicro Conference on Digital System Design (DSD), 2020, pp. 262–269.

--

--