What is a SHA-1 Collision Attack? | NuHarbor SecurityBy: Justin Fimlaid

What is SHA-1 and what is the history of SHA-1? Originally SHA-1 was developed as part of a U.S. government capstone project. The first version of SHA was SHA-0 and that was developed in 1993 as the Secure Hash Standard. SHA-0 was originally published as the Federal Information Processing standard publication 180. FIPS Publication 180 it was quickly withdrawn by the government and replaced in 1995 by SHA-1 which became FIPS Publication 180-1 and was then named the Secure Hash Algorithm. So that’s where the term SHA actually comes from.

The idea of this Secure Hash Algorithm is considered to be a one way compression function. It was initially intended as a security function validating the integrity of a message. Some of the ways that the SHA algorithm is used today, or other like compression algorithms or are one way hashes, is that they’re used to:

  • validate digital certificate signatures
  • they validate email PGP or GPG signatures
  • they validate software vendor signatures
  • they validate software updates
  • hey validate check sums
  • they validate backups

SHA hashes are also commonly used by systems such as GIT or subversion that are using hashes to validate the consistency of the software for the purposes of change management. The way this works is if you check out a piece of software code with a certain SHA value or hash value, then make changes and resubmit that version of the code it will have a different hash value or SHA value. This indicates to GIT or Subversion that a change has been made to the release even if it’s only a single character with in that specific code change. That is how key systems such as GIT or Subversion actually track whether a change has been made.

So let’s start getting into the technical explanation of SHA-1 collision attacks. For the attacks that have been successfully proven to date and that can be mathematically verified as a true collision exploit, those attack examples start from identical chaining values for two messages and using two pairs of blocks.

The difference in the first block pair caused a small difference in the output of the chaining value which is then cancelled by the difference in the second block pair leading to identical chaining values and hence thereby a collision. If you employed differential paths that are a precise description of differences in the state words and message words and how these different should propagate through the 80 steps needed to complete a short one calculation.

Although the first five state words are fixed in a SHA-1 changing value one can freely modify the message words of the computation and thus directly influence the next 16 state words the key solution is the concept of local collisions whereby any state bit difference introduced by the purge abuse and message. Big difference is to be canceled in the next five steps using correction message.

In order for this to work, the equation relies on what’s called the disturbance vector. The disturbance vector that is used to correctly calculate the expanded message the disturbance vector is one of the key pieces of this equation. Every time we’re a local collision has been identified by a certain benchmark the selection of a good disturbance vector is very important to the overall attacks success the main reason for using a two block pair instead of just using a one pair is that the choice alleviates a more important restriction on the disturbance factor namely is that there is no state differences in the final steps of the attack.

In the most simple explanation I can offer.  Think of a SHA-1 value chain as a series of parts of which some can be changed, others can’t be changed otherwise the whole value chain will be materially changed.

The way the attack works, if you split a chain into two parts or blocks you can modify the first block and compensate using the second block.  Say your hash value is 6. If your original first block has a value of 4 and your second original block has a value of 2–equaling 6.  If you modify your first block to a value of 4 and your second value to 2, the resulting value is still 6.

To execute this to exploit a SHA-1 hash requires a complex mathematical algorithm. Everything the computation runs and doesn’t match it’s a “collision”. To exploit a SHA-1 hash, it could take millions of millions collisions to find the right computational matches.

Now, while this is technically doable and has been proven as such, the compute needed to run this type of attack would be expensive and take a long time (think many years).

At this point, the risk is low of exploit and the cost benefit ratio of one SHA-1 collision attack would be questionable.

If you are interested in reading more about SHA-1 Collision Attacks, here are some good starting resources:

SHA-1 Collision Explanation Page: https://shattered.io/

Malicious Hashing: Eve’s Variant of SHA-1 https://link.springer.com/content/pdf/10.1007%2F978-3-319-13051-4_1.pdf

Finding SHA-1 Characteristics: General Results and Applications: https://link.springer.com/chapter/10.1007%2F11935230_1