How SHA (Secure Hash Algorithm) works?

7 min readJul 27, 2020

Any data can be hashed, regardless of its size or type and length, and the hash which any data generates using a hashing algorithm is always of the same length.

If you have never heard about SHA or Secure Hashing Algorithms, I will try to explain from the word “Hashing”. Hashing means a process to convert a given key to a certain value. A hash function is one that can be used to map data with different size to an array of bit with fixed size. It is a one-way function , which is make it infeasible to be inverted.

Hash is convenient for situations where computer need to identify, compare or run calculations againts files or strings of data. Hashing algorithm are used in many applications on real worlds, they are used for storing password, in computer vision, in database, etc. In the following discussion, I will focus more on our primary discussion, which is SHA algortihms, a cryptographic hash security.

This is a cyrptographic hash function, as we can see one word (“Fox”) turn into mix number and character. This is the so-called avalanche-effect.

Secure Hash Algorithm, or mostly know as SHA, are a cryptographic functions which is utilized to make our data secured. It works using a hash function such as an algorithm that comprises of bitwise operations, modular additions, and compression functions. This algorithm developed by National Institute of Standards and Technology along with NSA, previously released as a Federal Information Processing Standard, later in 1995, it was named SHA algorithm.

In above picture, there is a diagram depicts an avalanche-effect in which the term is associated with a specific behaviour of mathematical function used for encryption algorithm. In case of algorithm that uses hash value, even a small alteration in an input string should drastically change the hash value. In other words, flipping single bit in input string should at least flip half of the bits in the hash value. To implement a strong cipher or cryptographic hash function, this should be considered as one of the primary design objective.

| Characteristics of SHA

Cryptographic hash functions are utilized in order to keep data secured by providing three fundamental of characterictics which is consist of : pre-image resistance, second pre-image resistance, and collision resistance.

The pre-image resistance is important to clear off brute force attacks from a set of huge and powerful machines. The second resistance technique makes the attacker has to go through an enormous time on decoding the error message even though the first level has been decrypted. And the last but not least, collision resistance which is the most difficult thing to crack because the attacker need to find two completely different messages which hash to the same thing.

| Types of SHA

There are many types of SHA, some of those family are SHA-0, SHA-1, SHA-2, SHA-3 and SHA-256, each of which was succeded increasingly stronger encryption and still being updated in response to hacker attack. SHA-0, for example, this types is now obsolote due to widely exposed to the world. Because there are too many types of SHA algorithms, in this article I will just point out few of those types.

SHA - 0

SHA-0 is initially published in 1993, this algorithm hashes messages of any length in blocks of 512 bits, and produces a messages digest of 160 bits. However, it’s identified as inappropriate to use this by U.S National Institute of Standard and Technology (NIST) for any transactions associated with the use of cryptography. There is many attacks that causes NIST no longer consider it usable and withrew it.

Collision Resistance
First attack of this algorithm is published in 1998 and showed that this collision can be identified within 2⁶¹ operations. And and improved attacks in 2006 showed that it can even be indentified within 2³⁶ operations.
Pre-Image and Second Pre-Image Resistance
This attack indicate that the security margin of SHA-0 is resistant to these attacks. It’s also showed that a pre-image attack on 49 out of 80 rounds with complexity of 2¹⁵⁹, and showed a pre-image attack on 52 out of 80 rounds with a complexity of 2¹⁵⁶. Find more about pre-image attack here.
HMAC-SHA-0
The current attack vectors on HMAC can be classified as follows:
distinguishing attacks, existential forgery attacks, and key recovery
attacks. Key recovery attacks are by far the most severe. Attacks on hash functions can be conducted entirely offline, since the attacker can generate unlimited message-hash value pairs. Attacks on HMACs must be online because attackers need a large amount of HMAC values to deduce the key. The best results for a partial key recovery attack on HMAC-SHA0 were published at Asiacrypt 2006 with 2⁸⁴ queries and 2⁶⁰ SHA-0 computations

Here is the algorithm of how SHA-0 works:

Pad the message to be hashed by adding a 1, an appropriate number of 0 and the 64 bits integer representing the length of the message. After this padding operation, the message is formed of a integral number of 512 blocks.
Initialize 5 registers of 32 bits A, B, C, D and E with fixed constants:

A = 0x67452301

B = 0xEFCDhB89

C = 0x98BADCFE

D = 0xi0325476

E = 0xC3D2EIF0

3. For each message block, copy A, B, C, D and E respectively in AA, BB, CC, DD and EE. Apply the compression function to AA, BB, CC, DD, EE and the message block. This yields AA’, BB’, CO’, DD’ and EE’. These 5 values are then added respectively to A, B, C, D and E.

4. Output the concatenation of A, B, C, D and E.

SHA -1

As most dedicated hash functions used today, SHA-1 is based on the design principles of MD4.

Here is the algorithm of how SHA-1 works:

The input message is padded and split into 512-bit message blocks.
An 80-step compression function is then applied to each of this blocks. It has two types of inputs, the first one is a chaining input of 160 bits and the second one is message input of 512 bits.
Let g(m,h) denote the compression function with message input M and chaining input h. The chaining input h (n+1) for the next compression is calculated by h (n) + g(m,h(n)), called feed forward.
The chaining variables for the first iteration are set to fixed values (referred to as IV ).
The result of the last call to the compression function is the hash of the message. The compression function basically consists of two parts: the message expansion and the state update transformation.

In SHA-1, the message expansion is defined as follows. The message is represented by 16 32-bit words, denoted by Mi, with 0 ≤ i ≤ 15. In the message expansion, this input is expanded linearly into 80 32-bit words Wi. The expanded message words Wi are defined as follows:

SHA -256

Sha256 is one of the successor hash functions to SHA-1 (collectively referred to as SHA-2), and is one of the strongest hash functions available. SHA-256 is not much more complex to code than SHA-1, and has not yet been compromised in any way. The 256-bit key makes it a good partner-function for AES. It is defined in the NIST (National Institute of Standards and Technology) standard ‘FIPS 180–4’. NIST also provide a number of test vectors to verify correctness of implementation. There is a good description at Wikipedia.

FIPS 180–4 specifies that the message has a ‘1’ bit appended, and is then padded to a whole number of 512-bit blocks, including the message length (in bits) in the final 64 bits of the last block
Since we have a byte-stream rather than a bit-stream, adding a byte ‘10000000’ (0x80) appends the required bit “1”.
To convert the message to 512-bit blocks, I calculate the number of blocks required, N, then for each of these I create a 16-integer (i.e. 512-bit) array. For each if these integers, I take four bytes from the message (using charCodeAt), and left-shift them by the appropriate amount to pack them into the 32-bit integer.
The charCodeAt() method returns NaN for out-of-bounds, but the ‘|’ operator converts this to zero, so the 0-padding is done implicitly on conversion into blocks.
Then the length of the message (in bits) needs to be appended in the last 64 bits, that is the last two integers of the final block.

| Conclusions

As time flows, attack againts any complex security will get improve, and yet computer processing power will cheaper, For that reason, a good choice of algorithm will be necessarry in term of improvements and a compromise security. It take many years for research and vet new crpytographic standars, also develop a software that support them.

These algorithm are aimed to provide an additional security to the increasing and massive data that you want to secure with. Hackers or any attackers will find a crack in any newer form of hashing techniques human has ever made. All we need to do is just to assure that we are prompt enough to be more secure than letting the prey get all of our data. Hope you like this article and I will write more about algorithm article if you recommend for more.

References
___________________________________________________________________

https://link.springer.com/content/pdf/10.1007%2F11935230_1.pdf
https://link.springer.com/content/pdf/10.1007%2F978-3-540-28628-8_18.pdf
https://link.springer.com/content/pdf/10.1007%2F978-3-540-24654-1_13.pdf
https://www.educba.com/sha-algorithm/
https://www.thesslstore.com/blog/difference-sha-1-sha-2-sha-256-hash-algorithms/
https://www.cryptocompare.com/coins/guides/how-does-a-hashing-algorithm-work/