In the first article you learned about encryption basics. In the second part, I showed you the difference between symmetric and asymmetric encryption. And now you’ve graduated to the hypnotizing world of hashing.
We’re going to talk about data integrity and how to make sure the stuff you send isn’t changed. Not even a single character can escape the sweeping eye of data integrity.
Join me. We’re about to kick some crypto butt.
Now I don’t know about you but the first time I heard this term I envisioned a steaming hot hash brown in a paper McDonald’s sleeve.
Mmmm, my mouth is already watering like a German Shepherd.
I don’t know what’s wrong with me – okay maybe I’m weird but anyway – hashing is super important. There are two primary algorithms; two processes for creating the hash:
- MD5 (Message Digest 5)
- SHA (pronounced SHAH! Like the SHAHs of Sunset on Bravo TV. It stands for the Secure Hash Algorithm)
lol okay, so here’s how it works:
A hash function takes information, sends it through a hashing machine (which just runs a bunch of fancy calculations) and then spits out a short, fixed-length value known as a digest.
Let me show you how this works with a real life example:
Let’s say you’re writing a legal contract.
No wait, let’s say it’s a privacy document and you work for Apple and you want to make sure the recipient is signing the same copy you sent him. You need a way to ensure the integrity of the document because if the recipient changes the words he could effectively trick you into agreeing to different terms.
And that’s not very nice.
It doesn’t matter if the document is 12 pages or 1,200 pages long, you’ll always get a concise hash that represents the input stream data.
You basically feed the data to an MD5 monster and he farts out your stinky digest hahaha
If both digests match then you know nothing changed; however, if the hash is different then something changed. The hash doesn’t tell you what changed (you won’t know which line in the contract is different) but it does tell the recipient that the document she’s holding is different than the one you drafted.
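Here’s a minimal sketch of that idea in Python. The contract text and the `digest` helper are made up for illustration, and I’m using SHA-256 from the standard `hashlib` module, but any hash function would show the same behavior.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical contract text, purely for illustration.
short_contract = b"The recipient agrees to keep this confidential."
long_contract = short_contract * 10_000  # stand-in for a 1,200 page document
tampered = short_contract.replace(b"agrees", b"refuses")

# The digest is the same length no matter how big the input is.
print(len(digest(short_contract)), len(digest(long_contract)))  # 64 64

# Change one word and the digests no longer match.
print(digest(short_contract) == digest(tampered))  # False
```

Notice the comparison only says "different"; it gives no hint about which word changed.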
With cryptography you can apply hashing to almost any piece of data. For example, you can hash packets so you can tell if an intruder intercepted them, changed your bank withdrawal request from $200 to $200,000 and then sent them to your bank on your behalf.
Hashing is also important when downloading files. For example, how do you know the Linux ISO file you grabbed from Ubuntu is the real deal? How do you know it wasn’t modified? How do you know the file wasn’t corrupted in transit?
Instead of wasting your time with a bad ISO and having the installation freeze at 17% you can run an MD5 or SHA checksum on the file to make sure it matches the published hash.
On a Mac you can use openssl to compute the checksum.
openssl md5 ubuntu-14.10-desktop-amd64.iso
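If you’d rather script the check, here’s a rough Python equivalent. The function names `file_digest` and `matches_published` are my own, and the demo uses a tiny throwaway file as a stand-in for the ISO so the sketch runs anywhere. It reads the file in chunks so a multi-gigabyte image doesn’t have to fit in memory.

```python
import hashlib
import os
import tempfile


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the file's digest to the hash published by the vendor."""
    return file_digest(path, algorithm) == published_hex.strip().lower()


# Demo: a small temporary file plays the role of the downloaded ISO.
content = b"pretend this is an ISO image"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    demo_path = tmp.name

published = hashlib.sha256(content).hexdigest()  # what the download page would list
good_match = matches_published(demo_path, published)
bad_match = matches_published(demo_path, "0" * 64)
os.remove(demo_path)

print(good_match)  # True
print(bad_match)   # False
```

For the real thing you would call something like `matches_published("ubuntu-14.10-desktop-amd64.iso", "<hash from the download page>", "md5")`, pasting in whatever hash the site actually publishes.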
But wait, there’s a problem. A freggin’ big problem and I want to see if you can guess it.
Hacking the Hash
Let’s say an unethical hacker running Kali Linux is collecting frames over the air. He sees your document and the hash you generated and decides to change the document and then run this new modified file through an MD5 hash to create a new digest.
He then forwards the recipient the modified document and hash. The recipient gets the document and runs it through the MD5 hash and confirms it matches the digest sent by the attacker.
So a hacker can:
- Intercept the message and hash
- Change the message
- Generate a new digest
- Send it to the recipient without the recipient or the original sender knowing a thing
The digests will still match! They’ll match the attacker’s digest and not yours (the sender’s) but the recipient wouldn’t know that. That’s why we need a way to protect the hash.
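To see why a bare hash is no protection against someone sitting in the middle, here’s a small sketch of the attack. The messages and function names are invented for the example, and I’ve used SHA-256 instead of MD5; the trick works the same with either because the attacker isn’t breaking the hash, just recomputing it.

```python
import hashlib


def make_digest(document: bytes) -> str:
    """Anyone can compute this, including the attacker."""
    return hashlib.sha256(document).hexdigest()


def recipient_accepts(document: bytes, received_digest: str) -> bool:
    """The recipient's only check: does the document hash to the digest I got?"""
    return make_digest(document) == received_digest


# Sender creates the document and its digest.
original = b"Withdraw $200"
original_digest = make_digest(original)

# Attacker intercepts both, changes the message and generates a new digest.
forged = b"Withdraw $200,000"
forged_digest = make_digest(forged)

# The recipient only ever sees what the attacker forwards.
print(recipient_accepts(forged, forged_digest))    # True  (attack succeeds)

# The attack would only be caught if the original digest arrived untouched.
print(recipient_accepts(forged, original_digest))  # False
```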
Say hi to the HMAC
HMAC is the Hash-based Message Authentication Code. Instead of just hashing a file, we mix in a secret key that only the sender and recipient know. If the hacker modifies the file and calculates a new hash, it won’t check out. Since only the sender and receiver have the key, the recipient’s hash calculation will be different from the attacker’s.
HMAC is a symmetric algorithm because it uses the same key for the sender and receiver. So if the man-in-the-middle (MITM) changes the document, he can’t produce a valid digest without the key. When the recipient runs the document and the shared key through the hash, the digests won’t match and she’ll know something is wrong.
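Here’s a small sketch using Python’s standard `hmac` module. The key, the messages and the `make_tag`/`verify_tag` helpers are placeholders I made up; in real life the shared key would be random and exchanged securely ahead of time.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"demo-shared-secret"  # assumption: known only to sender and recipient
message = b"Withdraw $200"
tag = make_tag(shared_key, message)

# The attacker changes the message but has to guess at the key.
forged = b"Withdraw $200,000"
attacker_tag = make_tag(b"attacker-guess", forged)

print(verify_tag(shared_key, message, tag))          # True:  untouched message
print(verify_tag(shared_key, forged, attacker_tag))  # False: tag made with wrong key
print(verify_tag(shared_key, forged, tag))           # False: old tag, new message
```

`hmac.compare_digest` is used instead of `==` so the comparison doesn’t leak timing information about how many characters matched.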
That’s how it works. We use hashing to preserve data integrity, which means our data can’t be modified behind our backs without us noticing, and then we can encrypt it to make sure no one can even read it. We might also authenticate the peer so that we know we’re connecting to the right person. This dovetails nicely into VPNs but we’ll have to cover that in a different series.
Making sure it’s always there
Availability is the A in our CIA triad. If someone physically steals, breaks or changes a network resource we need to make sure people still have a way of connecting to it. That’s what availability is all about. It doesn’t matter if you have confidentiality with integrity if no one can access the web server because an attacker is launching a denial of service (DoS) attack on it.
You can maximize availability with fault tolerance. You can have multiple devices that share the burden. This is commonly called a high availability (HA) setup because if one device fails a standby device can seamlessly take over.
Mr Router, can I have your autograph?
A digital signature is a way to prove data origination. It’s a way to prove that something was sent from a specific device and not an imposter.
This is how it works:
Let’s say you want to prove to your employees that a message came from you. How would you do that?
You would generate a public/private keypair. Then you would run your message through a hashing algorithm to create a unique message digest.
Next, you would use your private key to encrypt the hash. That’s what a digital signature is.
You might wonder why you wouldn’t encrypt the data instead. Well that’s a good question but it ultimately comes down to performance. It takes fewer resources to encrypt a fixed length hash than it does to encrypt a message of arbitrary length.
This makes the digital signature unique to both the message and your private key. It binds the message to the sender and makes it virtually impossible for the sender to deny signing the message (assuming no one has compromised the key). It also gives your employees assurance of the origin because the signing was invoked with your private key (which is known only to you).
Alright so what happens next?
You send your employees your public key, the data and the encrypted digest.
They run the message through the same hashing algorithm to get a message digest. Then they’ll use your public key to decrypt the encrypted digest. If the decrypted digest matches the message digest they know not only that the data wasn’t tampered with in transit but also that it originated from you because no one else has your private key!
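The sign-then-verify flow above can be sketched with textbook RSA in plain Python. Big caveat: this is a toy. The two primes are tiny by real standards, there’s no padding, and the names (`sign`, `verify`, `digest_as_int`) are mine. Real systems use a vetted crypto library with much larger keys, but the math below follows the same hash, "encrypt with private key", "decrypt with public key" steps.

```python
import hashlib

# Toy key pair built from two known (Mersenne) primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and squeeze the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private key (D, N)."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key (E, N) and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


memo = b"Friday is a company holiday."
signature = sign(memo)

print(verify(memo, signature))                            # True
print(verify(b"Friday is a normal workday.", signature))  # False
```

Only `sign` touches `D`, so anyone holding just the public pair `(E, N)` can check a signature but can’t create one.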
And that’s how it works!
The Bottom Line
Cryptography is an amazing field of computer science. If anything I shared interested you, please let me know in the comments below!