What is Hashing? What is the Purpose of Hashing?
Let’s say you need to copy a file from one computer to another. How would you ensure that the two files (original and copy) are the same? You can use hashing to do this.
What is Hashing?
But what exactly is hashing, and how does it work?
A hashing algorithm transforms blocks of data that a file consists of into shorter values of fixed length. In other words, a hash value is basically a summary of what is in that file.
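To make the file-copy scenario concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hashing algorithm (the article does not prescribe one), and the function names `file_digest` and `same_content` are made up for illustration.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Feed the file block by block so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def same_content(original, copy):
    """True when both files produce the same SHA-256 hash value."""
    return file_digest(original) == file_digest(copy)
```

If `same_content("report.pdf", "backup/report.pdf")` returns `True`, the two files are almost certainly identical, and neither had to be compared byte by byte (the paths here are placeholders).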
How does this work?
Data is stored in hash tables as key and value pairs. Each hash table has 3 basic operations it can do:
- Insert – to insert a specific value in the hash table.
- Delete – to delete a specific value in the hash table.
- Search – to find a specific value in the hash table.
A hash key itself is an integer to which a hash function is applied and is used as an address of the hash table. In other words, a hash function maps hash keys to locations (indexes in the table).
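The three operations and the key-to-index mapping can be sketched as a toy hash table. This is an illustration, not a production design: the class name `HashTable`, the modulo hash function, and the choice to keep colliding entries in a small list per index are all assumptions made for the example.

```python
class HashTable:
    """Toy hash table storing key/value pairs in a fixed number of buckets."""

    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # The hash function: map an integer key to an index in the table.
        return key % self.size

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))

    def search(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None  # key not found

    def delete(self, key):
        index = self._index(key)
        before = len(self.buckets[index])
        self.buckets[index] = [(k, v) for k, v in self.buckets[index] if k != key]
        return len(self.buckets[index]) < before  # True if something was removed
```

With the default size of 8, the keys 42 and 50 both map to index 2, yet each can still be inserted, found and deleted on its own.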
However, if more than one hash key maps to the same index, this is called a “hash collision”. In other words, a hash algorithm can produce the same hash value from two different inputs, which we don’t want. A good hash algorithm keeps the chance of a collision as low as possible.
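A deliberately weak hash function makes collisions easy to see. The sketch below is purely illustrative: `toy_hash` and `find_collisions` are hypothetical helpers, and no real system should use a byte sum as its hash.

```python
def toy_hash(text, buckets=16):
    """Deliberately weak hash: sum of the byte values, modulo the bucket count."""
    return sum(text.encode("utf-8")) % buckets


def find_collisions(words, buckets=16):
    """Group words by their toy_hash value and keep only groups that collide."""
    groups = {}
    for word in words:
        groups.setdefault(toy_hash(word, buckets), []).append(word)
    return {value: group for value, group in groups.items() if len(group) > 1}
```

Because the byte sum ignores letter order, any two anagrams such as "listen" and "silent" collide. Stronger hash functions are designed so that such easy patterns do not exist.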
Most Common Types of Hashing
A hash algorithm can come in several types, the most common being:
SHA-2 followed SHA-1 and introduced many changes. It has six hash functions with hash values of 224, 256, 384 and 512 bits (SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224 and SHA-512/256). This type of hashing was developed by the NSA.
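As a quick check of the output sizes, the sketch below hashes the same message with four SHA-2 functions from Python's standard `hashlib` module and reports the length of each hash value in bits. The helper name `sha2_bit_lengths` is an assumption for the example, and the two truncated SHA-512 variants are left out because their availability depends on how Python was built.

```python
import hashlib


def sha2_bit_lengths(message):
    """Return the hash value length, in bits, for four SHA-2 functions."""
    lengths = {}
    for name in ("sha224", "sha256", "sha384", "sha512"):
        digest = hashlib.new(name, message).digest()
        lengths[name] = len(digest) * 8  # bytes to bits
    return lengths
```

The lengths are the same no matter how long `message` is, which is what "fixed length" means in practice.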
The cyclic redundancy check is typically used to check for file integrity in FTP servers and Zip files. It’s an error-detecting code that looks for accidental data changes.
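Python's standard `zlib` module exposes a CRC-32 function, so an accidental-change check can be sketched in a few lines. The helper name `is_intact` is hypothetical.

```python
import zlib


def is_intact(data, expected_crc):
    """True when the CRC-32 of the data matches the checksum recorded earlier."""
    return zlib.crc32(data) == expected_crc


original = b"quarterly report, final version"
checksum = zlib.crc32(original)          # recorded when the data was written

corrupted = b"quarterly report, final versioN"   # one byte changed in transit
```

Here `is_intact(original, checksum)` holds while `is_intact(corrupted, checksum)` does not. A CRC is built to catch accidental changes only; it offers no protection against someone altering the data on purpose.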
MD5 is an older type of hashing, which encodes information into a 128-bit fingerprint and is usually used to verify data integrity (as a checksum). However, it is very vulnerable to hash collisions.
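For checksum use, MD5 is available through Python's `hashlib`. The sketch below (with a hypothetical helper name, `md5_fingerprint`) shows the 128-bit fingerprint as 32 hexadecimal characters. Given the collision weakness, it should be treated as a guard against accidental corruption rather than as a security measure.

```python
import hashlib


def md5_fingerprint(data):
    """Return the MD5 fingerprint of the data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


fingerprint = md5_fingerprint(b"backup archive contents")
bits = len(fingerprint) * 4   # each hex character encodes 4 bits
```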
This type of hash function is admittedly slower, but that is by design: it is intended to thwart password cracking by making every guess more time-consuming, so that attackers can’t run quick attacks.
This hash function works best with small data and its main advantage lies in great speeds compared to other hash functions.
RipeMD is used in the Bitcoin standard (and by other cryptocurrencies as well). It is most commonly used in its 160-bit configuration, but 128-, 256- and 320-bit variants exist as well.
What are the Pros and Cons of Hashing?
Of course, just as hashing has its advantages, it also has its disadvantages. Let’s take a look at the pros and cons of hashing next.
- Hashing can compare two files to see if they are equal without the need to open them side-by-side and compare them word-for-word. This makes hashing much faster than other types of data tables, especially if there’s a large number of different entries.
- Additionally, hashing can also verify the integrity of the copied file in a file backup program to ensure that it was not corrupted during transfer in any way.
- Hashing randomizes data, but doesn’t sort it. This makes it useful for retrieving data, but not very useful if you want that data to be in any particular order.
- In addition, a malicious 3rd-party might be able to craft inputs that cause a disproportionate number of collisions in the hash table, which can lead to a denial of service (DoS).
What is Salting? How is a Password Salted and Hashed?
You have probably heard the term “salted and hashed”. What does this mean? How is a password “salted”?
Salting is the practice of combining each stored password with a randomly generated string of characters before hashing the result. This string is called a “salt”.
Each user and password will have a unique salt, stored with the hash.
What does salting do?
Salting serves two main purposes in protecting stored passwords.
The first one is that salting prevents precomputed attacks, such as rainbow table lookups, or at least makes them very difficult.
The second is that two different users with identical passwords, which is quite likely in any large database, will not end up with matching hashes, because the server generates a different salt for each of them.
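The salt-then-hash flow can be sketched with Python's standard library alone. This example assumes PBKDF2-HMAC-SHA256 as the password hashing function because it ships with `hashlib`; it is a stand-in, not the algorithm any particular service uses. The function names and the iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for the example; tune for real use


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate password with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(digest, expected_digest)
```

Calling `hash_password("hunter2")` twice yields two different salts and therefore two different digests, so a database holding both gives no hint that the passwords are the same. The salt is stored next to the digest so that `verify_password` can repeat the computation at login.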
Hashing vs Encryption Differences
At first glance, hashing does the same thing as encryption, which is to scramble the data so that it becomes unreadable to anyone without the proper key or way to decipher it.
However, there are actually quite a few differences between hashing and encryption.
One of the biggest differences is that hashing works only one-way, which makes the hashing function irreversible. Encryption, on the other hand, is a two-way process, which makes it reversible.
The purpose of the two is therefore different as well.
Hashing is used to convert input data into an output hash. This hash is a fixed-length string of characters, called a “hash value” or simply “hash”.
Encryption is used to convert normal text into a ciphertext, which can only be deciphered using a decryption key. This is usually done with either one key that is used to both encrypt and decrypt the message, as in the case of symmetric encryption, or with one key (public) to encrypt and another (private) to decrypt, as in asymmetric encryption.
Here are the differences:
- Hashing is a one-way process.
- Encryption is a two-way process.
- Hashing is used to verify the integrity of the file.
- Encryption is used to keep a message confidential, so that only someone who holds the decryption key can read it.
- Hashing produces a fixed-length hash string.
- Encryption gives a string of variable length.
- The main hashing algorithms are: SHA-2, MD5, BCrypt, CRC32…
- The main encryption algorithms are: AES, RSA, DES, Blowfish…
- Hashing can’t be converted back into the original message.
- Encryption can be reversed to the original by using the decryption key.
So which one is better? Hashing vs encryption?
It depends on the purpose.
For instance, if you need to store passwords in a database, hashing is the better option because, in the event of a breach, a hacker won’t be able to read the passwords in plaintext.
Of course, neither is perfect and a mix of hashing and encryption is usually the best way to protect an organization and its data.
For instance, when you sign up to CTemplar, the RSA private and public keys are generated using your password as the private key passphrase and stored on the CTemplar server. The keys are then retrieved with a successful user login and the password never gets sent to the server in its plaintext form.
Instead, the password is hashed with the salt from your username and can’t be used to find out the actual password you made. This way, only you will know the actual password that you will need to decrypt an email. Not even CTemplar will be able to see your password. The hash is irreversible.
You can read more about how CTemplar manages your passwords and generates keys in our whitepaper.
Consistent hashing is a strategy used in systems with large distributed databases, where it becomes necessary to divide the data between many computers. This works independently of the number of servers and/or objects in a distributed hash table by assigning each a position in a hash ring, allowing the servers and object to scale.
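A hash ring can be sketched in a few dozen lines. The version below is a simplified illustration: it places each server at a single position on the ring (real deployments often give each server many positions), uses SHA-256 to compute positions, and all class and function names are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a server or object name to a position on the ring (a large integer)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, servers=()):
        self._positions = []  # sorted positions of all servers on the ring
        self._servers = {}    # position -> server name
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        position = ring_position(server)
        if position in self._servers:
            return  # already on the ring
        bisect.insort(self._positions, position)
        self._servers[position] = server

    def remove_server(self, server):
        position = ring_position(server)
        self._positions.remove(position)
        del self._servers[position]

    def server_for(self, key):
        """Walk clockwise from the object's position to the first server."""
        if not self._positions:
            raise LookupError("ring has no servers")
        index = bisect.bisect_right(self._positions, ring_position(key))
        # Wrap around to the first server when we run past the end of the ring.
        return self._servers[self._positions[index % len(self._positions)]]
```

The useful property is that removing a server only moves the objects that lived on it, and adding a server only takes objects from its neighbour on the ring. Every other object stays where it was, which is what lets the system scale without reshuffling all of the data.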
Salt is a random data value added to the password (usually at the end) before hashing, so that the same password produces a different hash value each time. This adds extra security against precomputed attacks such as rainbow tables.
Hashing is a one-way process and there is no key that will convert the input to its original value. Encryption is a two-way process that uses keys to change text into ciphertext and back.
Asymmetric encryption uses a public key to encrypt data and a private key to decrypt it.
Double hashing is a collision-resolving technique used in open-addressed hash tables. The idea is to apply a second hash function to the hash key in the event of a collision, and to use its result as the step size when looking for the next free slot.
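A minimal sketch of double hashing for integer keys follows. The two hash functions, the prime table size and the class name `DoubleHashTable` are assumptions chosen for illustration, and deletion is left out to keep the example short.

```python
class DoubleHashTable:
    """Open-addressed table of integer keys that resolves collisions by double hashing."""

    _EMPTY = object()  # marker for a slot that has never been used

    def __init__(self, size=11):
        # With a prime size, every step value eventually visits every slot.
        self.size = size
        self.slots = [self._EMPTY] * size

    def _probe(self, key):
        start = key % self.size              # first hash: where to look first
        step = 1 + (key % (self.size - 1))   # second hash: step size, never zero
        for attempt in range(self.size):
            yield (start + attempt * step) % self.size

    def insert(self, key):
        for index in self._probe(key):
            if self.slots[index] is self._EMPTY or self.slots[index] == key:
                self.slots[index] = key
                return index
        raise OverflowError("table is full")

    def search(self, key):
        for index in self._probe(key):
            if self.slots[index] is self._EMPTY:
                return None  # reached an unused slot: the key is absent
            if self.slots[index] == key:
                return index
        return None
```

The keys 5, 16 and 27 all start at slot 5 in a table of size 11, but the second hash gives them step sizes of 6, 7 and 8, so 16 lands in slot 1 and 27 in slot 2 instead of queuing up behind each other.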
Both hashing and encryption have their important place in cryptography and data security, especially when it comes to password management.
Hashing, for instance, is the way to go if you are worried that a file you want to send to someone might be intercepted and changed by a 3rd-party.
On the other hand, encryption is best used if you need to send a message (like an email message) and ensure that only the right recipient will be able to read it.
At CTemplar, we protect your passwords and other data with both encryption (OpenPGP, 4096-bit) and hashing (we use BCrypt for hashing passwords).