How does HTTPS actually work? That was the question I set out to solve a few days ago for a project at work.
As a web developer, I knew that using HTTPS to protect users’ sensitive data was A Very Good Idea, but I didn’t have much understanding about how it actually worked.
How is data protected? How can a client and server create a secure connection if someone is already listening in on the wire? What is a security certificate and why do I need to pay someone to get one?
A Series of Tubes
Before we dive into how it all works, let’s talk briefly about why it’s important to secure connections in the first place, and what sorts of things HTTPS guards against.
When you make a request to visit your favorite website, that request must pass through many different networks — any of which could be used to potentially eavesdrop or tamper with your connection.
From your own computer to other machines on your local network, to the access point itself, through routers and switches all the way to the ISP and through the backbone providers, there are a lot of different organizations that ferry a request along. If a malicious user gets into any one of those systems, they have the potential to see what’s traveling over the wire.
Normally, web requests are sent over regular ol’ HTTP, where a client’s request and the server’s response are both sent as plain text. There are lots of good reasons why HTTP doesn’t use secure encryption by default:
- Security requires more computation power
- Security requires more bandwidth
- Security breaks caching
But sometimes, as the developer of a web application, you know that sensitive information like passwords or credit card data will be going over the connection, so it’s necessary to take extra precautions against snooping on those pages.
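To make the “plain text” part concrete, here’s a rough sketch of what a login form submission looks like on the wire over regular HTTP. The host, path, and credentials are made up for illustration.

```python
# Hypothetical login request: the host, path, and credentials are invented.
body = "username=alice&password=hunter2"
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
)

# These are the bytes that every network hop along the way gets to see.
wire_bytes = request.encode("ascii")

# No decoding or key is needed: the password is sitting right there.
password_visible = b"password=hunter2" in wire_bytes

print(wire_bytes.decode("ascii"))
print("Password visible to an eavesdropper:", password_visible)
```

Anyone who can read those bytes at any hop can read the password without doing any extra work.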
Transport Layer Security (TLS)
We’re about to dive into the world of cryptography, but you shouldn’t need much experience to keep up. We’ll really only be scratching the surface.
Cryptography is the practice of securing communications against potential adversaries — people who might want to interfere with the communication, or just listen in.
TLS, the successor to SSL, is the protocol most often used to implement secure HTTP connections (i.e., HTTPS). TLS sits at a lower level in the OSI model than HTTP, which is basically a fancy way of saying that, during a web request, the TLS connection stuff happens before the HTTP connection stuff.
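Here’s a minimal sketch of that ordering using Python’s standard `socket` and `ssl` modules. The function name `fetch_over_tls` is my own, and the details (a `HEAD` request, a 10 second timeout) are arbitrary choices for illustration.

```python
import socket
import ssl


def fetch_over_tls(host: str, port: int = 443) -> bytes:
    """Open a TCP connection, do the TLS handshake, then speak HTTP."""
    # The default context verifies the server's certificate and hostname.
    context = ssl.create_default_context()

    # Step 1: a plain TCP connection.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # Step 2: the TLS handshake happens here, before any HTTP is sent.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Step 3: only now does the HTTP request go out, already encrypted.
            request = (
                f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            )
            tls_sock.sendall(request.encode("ascii"))
            return tls_sock.recv(4096)
```

Calling something like `fetch_over_tls("example.com")` should return the start of the server’s HTTP response, but only after the handshake inside `wrap_socket` has succeeded.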
TLS is a hybrid cryptographic system, meaning it makes use of multiple crypto paradigms, both of which we’ll look at next:
- Public Key Cryptography for shared secret generation and authentication (making sure you are who you say you are).
- Symmetric Key Cryptography using shared secrets for encrypting requests and responses.
Public Key Encryption
Public key encryption is a type of cryptographic system where each party has both a private and a public key, which are mathematically linked to each other. The public key is used for encrypting plaintext to “ciphertext” (essentially, gibberish), while the private key is used for decrypting that gibberish back into plaintext.
Once a message has been encrypted by a public key, it can only be decrypted with the corresponding private key. Neither key can perform both functions by itself. The public key can be published freely without compromising the security of the system, but the private key must not be revealed to anyone who isn’t authorized to decrypt messages. Hence the names, public and private.
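As a rough illustration of the public/private split, here’s a toy version of RSA (one well-known public key system) with tiny primes. The numbers are far too small to be secure, and real implementations also add padding and other safeguards, so treat this purely as a sketch of the idea.

```python
# Toy key pair built from tiny primes. Real keys use primes that are
# hundreds of digits long; these are only for illustration.
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)          # safe to publish
private_key = (d, n)         # must stay secret


def encrypt(message: int, key: tuple) -> int:
    """Turn a small integer message into ciphertext using the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Recover the original message using the private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(message, ciphertext, recovered)
```

Anyone can run `encrypt` with the published key, but only the holder of `private_key` can get the original message back.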
One of the cool benefits of public key cryptography is that two parties with no prior knowledge of each other can create a secure connection while initially communicating over an open, insecure connection.
The client and the server can both use their own private keys — along with some shared, public information — to agree upon a shared secret key for the session.
That means that even if someone is sitting in between the client and server and watches the connection happen, they still can’t determine the private keys of either the client or the server, or the secret key for the session.
How is this possible? Math!
One of the most common ways this exchange is performed is by using a Diffie-Hellman key exchange. This process allows the client and server to agree upon a shared secret, without having to transmit that secret over the connection. Again, snoopers can’t determine the shared secret even if they’re watching every packet on the connection.
Once the initial DH exchange takes place, the resulting shared secret can be used to encrypt further communications in that session using a much simpler symmetric key encryption, which we’ll look at in a bit.
A Bit of Math…
The math behind it is actually fairly simple to calculate one way, but essentially impossible to reverse. This is where the importance of having really large prime numbers comes into play.
If Alice and Bob are two parties performing a DH key exchange, they start by agreeing on a root (generally a small number, like 2, 3 or 5) and a large prime (300+ digits), both of which can be sent in the clear without compromising the security of the exchange.
Remember, Alice and Bob each have their own private key (100+ digits) that should never be shared, either between them or with anyone else. What they exchange publicly over the network is a mixture of their private keys, plus the root and the prime. Specifically:
Alice’s mixture = (root ^ Alice’s Secret) % prime
Bob’s mixture = (root ^ Bob’s Secret) % prime
^ here means “raised to the power of”, and % means modulo, taking the remainder after division
So Alice creates her mixture using the agreed upon constants (root and prime) plus her private key, and Bob does the same. Once they’ve received each other’s mixture, they then perform some more math to derive the shared secret for the session. Specifically:
(Bob’s mixture ^ Alice’s Secret) % prime
(Alice’s mixture ^ Bob’s Secret) % prime
This calculation generates the same number for both Alice and Bob, and that number becomes the shared secret for this session. Note that neither party had to send their private key to the other one, and the resulting shared secret was also never sent over the connection. Brilliant!
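Here’s the whole exchange as a small sketch with deliberately tiny numbers, so you can check the arithmetic by hand. The specific values (root 5, prime 23, secrets 6 and 15) are just illustrative; a real exchange uses the huge numbers described above. It works because both sides end up computing root raised to (Alice’s secret × Bob’s secret), modulo the prime.

```python
# Public constants, sent in the clear (toy-sized for readability).
root = 5
prime = 23

# Private keys, never sent anywhere.
alice_secret = 6
bob_secret = 15

# Each side computes its mixture and sends it over the open connection.
alice_mixture = pow(root, alice_secret, prime)   # (root ^ secret) % prime
bob_mixture = pow(root, bob_secret, prime)

# Each side combines the OTHER side's mixture with its OWN secret.
alice_shared = pow(bob_mixture, alice_secret, prime)
bob_shared = pow(alice_mixture, bob_secret, prime)

print("Sent over the wire:", alice_mixture, bob_mixture)
print("Shared secret (never sent):", alice_shared, bob_shared)
```

With these toy values the mixtures are 8 and 19, and both sides arrive at the shared secret 2. A snooper sees only 5, 23, 8 and 19.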
For those who are less math-inclined, the Wikipedia article has a great image involving mixing colors:
Notice how the starting color (yellow) ends up getting “mixed” with both Alice’s color and Bob’s color. That’s how it ends up being the same for both parties at the end. The only thing that’s sent over the connection is the half-way-done mixture which is meaningless to anyone watching the connection.
Symmetric Key Encryption
This public key exchange only needs to happen once per session, the first time the client and server connect. Once they’ve agreed on a shared secret, the client and server communicate using a symmetric-key crypto system, which is much more efficient: symmetric ciphers are far cheaper to compute than public key operations, and no further key exchange is needed for the rest of the session.
With the shared secret they agreed upon earlier, plus an agreed-upon cipher suite (essentially a collection of encryption algorithms), the client and server can now communicate securely, encrypting and decrypting each others’ messages using the shared secret, with a snooper just seeing gibberish going back and forth.
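To show what “symmetric” means in practice, here’s a toy cipher where the same shared secret both encrypts and decrypts. This is not what TLS actually uses (real cipher suites rely on vetted algorithms such as AES-GCM or ChaCha20-Poly1305), and the function names are my own. It only demonstrates the same-key-both-ways property.

```python
import hashlib


def keystream(shared_secret: bytes, length: int) -> bytes:
    """Stretch the shared secret into `length` pseudo-random bytes (toy only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(shared_secret + counter.to_bytes(8, "big"))
        out += block.digest()
        counter += 1
    return out[:length]


def xor_cipher(shared_secret: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; running it twice undoes it."""
    stream = keystream(shared_secret, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_secret = b"secret agreed on during the handshake"
plaintext = b"GET /account HTTP/1.1"

ciphertext = xor_cipher(shared_secret, plaintext)   # what the snooper sees
recovered = xor_cipher(shared_secret, ciphertext)   # what the server reads

print(ciphertext.hex())
print(recovered)
```

Both sides call the same function with the same secret, which is exactly why the secret has to be agreed on safely first.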
The Diffie-Hellman key exchange allows two parties to create a private, shared secret. But how do the two parties know they’re talking to the correct entity? We haven’t talked about authentication yet.
What if I picked up the phone and called my friend and we performed a Diffie-Hellman key exchange, but it turns out my call was intercepted and I was actually talking to someone else? I’d still be able to communicate securely with that person — no one else would be able to decode our communication once we negotiated the shared secret — but they’re not who I thought I would be talking to. That’s not very secure!
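The phone call scenario can be sketched with the same toy Diffie-Hellman arithmetic. The variable names and numbers here are made up for illustration. The point is that the math works just as well with an impostor on the other end.

```python
# Same public constants as any other exchange (toy-sized).
root, prime = 5, 23

# I think I'm calling my friend, but an impostor answers instead.
my_secret = 6
impostor_secret = 13

# We exchange mixtures exactly as the protocol says.
my_mixture = pow(root, my_secret, prime)
impostor_mixture = pow(root, impostor_secret, prime)

# Both sides derive a shared secret without any complaint from the math.
my_key = pow(impostor_mixture, my_secret, prime)
impostor_key = pow(my_mixture, impostor_secret, prime)

# The exchange "succeeds": the keys match, yet nothing proved who answered.
exchange_succeeded = my_key == impostor_key
print("Keys match:", exchange_succeeded)
```

Nothing in the exchange itself carries any proof of identity, which is the gap authentication has to fill.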
To solve the authentication problem, we need a Public Key Infrastructure to make sure that entities are who they say they are. These infrastructures are set up to create, manage, distribute and revoke signed certificates. Certificates are those annoying things you have to pay for in order to serve your site over HTTPS.
But what exactly is a certificate, and how does it make things more secure?
At a high level: a public key certificate is a file that uses a digital signature (more on that in a minute) to bind a machine’s public key with an identity. The digital signature on the certificate is someone vouching for the fact that a particular public key belongs to a particular individual or organization.
Certificates essentially associate domain names (the identities) with a particular public key. This prevents a snooper from presenting their own public key, pretending to be the server a client is trying to reach.
In the phone call example above, the attacker could try presenting his public key, pretending he’s my friend — but the signature on that certificate wouldn’t be from someone I trusted.
In order to be trusted by the average web browser, certificates have to be signed by a trusted Certificate Authority (CA). CAs are companies that perform manual inspection and review, to make sure that the applying entity is both:
- a real person or business that exists in the public record
- in control of the domain for which they’re requesting a signed certificate
Once the CA verifies that the applicant is real and really owns the domain, the CA will “sign” the site’s certificate, essentially putting their stamp of approval on the fact that this site’s public key really belongs to them and should be trusted.
Your browser comes preloaded with a list of trusted CAs. If a server returns a certificate that isn’t signed by a trusted CA, the browser will flash a big red error warning. Otherwise anyone could go around “signing” bogus certificates. There needs to be a layer of trust in the system.
So even if an attacker were to take their machine’s own public key and generate a certificate saying that public key was associated with facebook.com, a browser wouldn’t trust it since that certificate isn’t signed by a trusted CA.
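Here’s a heavily simplified sketch of that check, using toy RSA signatures. Everything in it is an assumption for illustration: the helper names, the dictionary “certificate” format, and the key sizes. Real certificates are X.509 structures with much larger keys and proper signature padding.

```python
import hashlib


def make_keypair(p: int, q: int, e: int):
    """Build a toy RSA key pair from two primes (illustration only)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (e, n), (d, n)               # (public key, private key)


def digest(domain: str, site_public_key, modulus: int) -> int:
    """Hash the identity plus public key down to a number below the modulus."""
    data = f"{domain}|{site_public_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def sign_certificate(domain: str, site_public_key, signer_private_key) -> dict:
    """Bind a domain to a public key by signing their digest."""
    d, n = signer_private_key
    signature = pow(digest(domain, site_public_key, n), d, n)
    return {"domain": domain, "public_key": site_public_key,
            "signature": signature}


def browser_trusts(cert: dict, trusted_ca_public_keys: list) -> bool:
    """Accept the certificate only if a trusted CA's key verifies it."""
    for e, n in trusted_ca_public_keys:
        expected = digest(cert["domain"], cert["public_key"], n)
        if pow(cert["signature"], e, n) == expected:
            return True
    return False


# The CA's key pair; its public half ships with the browser.
ca_public, ca_private = make_keypair(2147483647, 2305843009213693951, 65537)
trusted_cas = [ca_public]

# A legitimate site gets its public key signed by the CA.
site_public, _ = make_keypair(89, 97, 5)
real_cert = sign_certificate("example.com", site_public, ca_private)

# An attacker signs their own certificate for a domain they don't control.
attacker_public, attacker_private = make_keypair(61, 53, 17)
fake_cert = sign_certificate("facebook.com", attacker_public, attacker_private)

print("Real cert trusted:", browser_trusts(real_cert, trusted_cas))
print("Fake cert trusted:", browser_trusts(fake_cert, trusted_cas))
```

The attacker can produce a signature, just not one that checks out against any key the browser already trusts.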
Other Things to Know About Certificates
In addition to the regular X.509 certificates, extended validation (EV) certificates promise a stronger layer of trust.
When granting an extended validation certificate, the CAs must do even more checking into the identity of the entity who owns the domain (usually requiring passport or utility bills).
This type of certificate turns the browser bar green, in addition to showing the usual padlock icon.
Serving Multiple Websites from the Same Server
Because the TLS handshake occurs before the HTTP connection begins, there can be problems if there are multiple websites hosted on the same server, at the same IP address.
Name-based virtual host routing happens in the web server, but the handshake happens before the connection reaches that point. The single certificate for that system needs to be sent on requests to any of the sites hosted on that machine, which can create problems for shared hosting environments.
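A very simplified model of that ordering problem might look like the sketch below. The IP address, hostnames, and function names are all hypothetical, and the model follows the setup described above: the certificate is chosen knowing only the IP address, and the `Host` header is read afterwards.

```python
# One certificate per IP address (hypothetical values).
CERT_BY_IP = {"203.0.113.10": "certificate for shop.example"}

# Several sites share that one IP and are told apart by the Host header.
SITES_BY_HOST = {"shop.example": "shop site", "blog.example": "blog site"}


def tls_handshake(ip: str) -> str:
    """Pick a certificate. At this point only the IP address is known."""
    return CERT_BY_IP[ip]


def http_route(host_header: str) -> str:
    """Pick a site. This runs later, once the HTTP request can be read."""
    return SITES_BY_HOST[host_header]


def connect(ip: str, host_header: str) -> tuple:
    certificate = tls_handshake(ip)    # happens first
    site = http_route(host_header)     # happens second
    return certificate, site


print(connect("203.0.113.10", "shop.example"))
print(connect("203.0.113.10", "blog.example"))
```

In this model a visitor to `blog.example` is handed the certificate for `shop.example`, because the server had to commit to a certificate before it knew which site was wanted.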
If you’re using a web hosting company, they’ll usually require that you purchase a dedicated IP address before you can get HTTPS set up for your website. Otherwise they’d constantly need to get new certificates (and get them re-verified by CAs) every time a site on that machine updated.
Wikipedia is a great resource for this stuff, and this Coursera course looks especially interesting. Thanks to the guys in the security.stackexchange.com chat room for answering some of my questions this morning.