HTTP Under the Hood: Here's What Actually Happens
Most backend engineers think that HTTP is a stateless request-response protocol. That's technically correct but not complete. For the last few days I have been reading about HTTP in depth, and this post is a summary of what I learned.
In This Post
- What is HTTP?
- What is a protocol?
- What happens when you visit a website?
- How HTTP is structured
- How the server actually reads an HTTP request
- HTTP/1.0 vs HTTP/1.1 vs HTTP/2.x
- Compression
- HTTP Connection Management
- HTTP Caching
HTTP is a protocol that allows us to transfer files and documents over the internet between a server and a client (browser). It follows a request-response model in which your browser sends a request to the server and receives a response. This response is then displayed on your browser screen. It can be a webpage, an image, or even a video. This is what most backend engineers know, but do you really know what happens under the hood? Let's first start with protocols.
What is Protocol?
A protocol is a set of rules and procedures that need to be followed. In computer science the concept is also the same. A protocol defines the rules that computers need to follow while communicating and sending data to each other. Some protocols also define how to share data and also the structure of the data. For example, HTTP uses the TCP protocol under the hood. So all the networking stuff is handled by TCP. HTTP will only show how to structure the data.
So we can understand that the HTTP protocol defines what the data looks like and then hands it over to the TCP protocol, which delivers it to the remote server.
For every protocol, a detailed document is published that contains all the rules you need to follow while working with that protocol. For HTTP, that document is the HTTP specification (RFC 9110 and its companion RFCs).
What happens when you visit a website?
You opened your browser, typed sushantdhiman.dev in the URL bar, pressed enter, and magically, somehow, my website opened. But how did this happen? There are several steps involved.
Let’s remove the “magic” part.
Step 1: Your browser doesn’t know domains
Humans like names; computers don't. Your browser cannot send a request to sushantdhiman.dev. It needs an IP address. So the first thing it does is try to get the IP address associated with this domain. This is where DNS comes in. Think of DNS as a distributed phonebook of the internet. Your browser asks:
“Give me the IP address for this domain.”
IPs are also cached at several layers (browser, OS, or ISP); if the IP is already cached, you get it instantly. Otherwise, a chain of DNS queries happens behind the scenes. It is the browser's responsibility to talk to DNS and get the IP.
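In practice you rarely speak the DNS wire protocol yourself; you ask the OS resolver, which consults its caches and fires off DNS queries only when needed. A minimal sketch in Python (the `resolve` helper is my own name, not a standard API):

```python
import socket

def resolve(domain: str) -> list[str]:
    """Ask the OS resolver (which checks caches, then DNS) for IPv4 addresses."""
    infos = socket.getaddrinfo(domain, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # the IP string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))  # typically ['127.0.0.1']
```

`localhost` is used here because it resolves without touching the network; a real domain would trigger the cache-then-query chain described above.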
Now the browser finally knows where to send the request.
Step 2: Connection is created
You can't just “send HTTP requests” to a server. First, you need a connection. As I told you earlier, this is handled by TCP.
Before any data is sent, your machine and the server establish a connection using the 3-way handshake, so called because it takes 3 steps. A lot happens while creating a connection, but we will stick to an abstraction for simplicity. Here are those 3 steps:
- (SYN) - Client - “Hey, can we talk?”
- (SYN-ACK) - Server - “Yes, I’m ready.”
- (ACK) - Client - “Cool, let’s start.”
After this, you now have a reliable communication channel. But HTTP hasn’t even started yet.
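Application code never sends SYN or ACK by hand; the kernel does the handshake when you ask for a connection. A small sketch, using a throwaway listening socket on localhost as a stand-in server so it runs anywhere:

```python
import socket

# A stand-in "server": a listening socket on an ephemeral localhost port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()

# create_connection() is where the kernel performs SYN / SYN-ACK / ACK;
# by the time it returns, the 3-way handshake is complete.
client = socket.create_connection((host, port))
conn, _ = server.accept()  # the server's end of the established connection

handshake_done = client.getpeername() == (host, port)
print("connected:", handshake_done)

client.close(); conn.close(); server.close()
```

Everything HTTP does from here on is reads and writes on a socket like `client`.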
Step 3: Finally — HTTP starts
Now your browser sends an HTTP request. A simple HTTP request looks something like this:
GET / HTTP/1.1
Host: sushantdhiman.dev
User-Agent: Mozilla/5.0 ...
Accept: text/html
Don't worry; we will talk about this. It's literally a string following a specific format defined by the HTTP protocol. This is the "rules" part we talked about earlier.
Step 4: Server sends response
The server sends something like:
HTTP/1.1 200 OK
Content-Type: text/html
<html>...</html>
Again — just structured data. Your browser will then render the received HTML, and then you'll see a webpage.
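The request shown in step 3 really is just a string with `\r\n` line endings and a blank line at the end. A sketch of building those exact bytes (the `build_request` helper is my own, not a library function):

```python
def build_request(host: str, path: str = "/") -> bytes:
    # Each line ends in \r\n; a blank line marks the end of the headers.
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html",
        "Connection: close",
        "",  # the blank line that terminates the headers
        "",  # no body for a GET
    ]
    return "\r\n".join(lines).encode("ascii")

raw = build_request("sushantdhiman.dev")
print(raw)
```

Writing `raw` to the TCP socket from step 2 is all "sending an HTTP request" means.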
How HTTP is structured
Most engineers say “HTTP is just text”. That’s true, but again, incomplete. HTTP is not just text. It’s structured text with very strict rules. If you break those rules even slightly, the server won’t “try to understand you”. It will just reject the request. Let’s look at what actually gets sent.
When your browser sends a request, it’s not sending some JSON object or fancy abstraction. It literally writes bytes to a TCP socket in a very specific format.
A typical request looks like this:
GET /blog/http-deep-dive HTTP/1.1
Host: sushantdhiman.dev
User-Agent: Mozilla/5.0
Accept: text/html
Connection: keep-alive
There are three important parts here, and the order matters.
Request Line
GET /blog/http-deep-dive HTTP/1.1
This line tells the server three things:
- What action you want (GET)
- What resource you want (/blog/http-deep-dive)
- Which version of HTTP you're speaking (HTTP/1.1)
If you mess this line up, nothing else matters.
Headers
Headers are just key-value pairs. If you have built any HTTP API, you already know what headers are.
Host: sushantdhiman.dev
User-Agent: Mozilla/5.0
Accept: text/html
This is where most real-world behavior is controlled. For example:
- Host tells the server which website you're trying to reach (important for shared servers)
- User-Agent tells what kind of client is making the request
- Accept tells what kind of response you can handle
These are not optional in practice. For example, in HTTP/1.1, if you don’t send Host, many servers will reject your request.
Empty Line
Then comes an empty line. This is not random. This empty line is a delimiter. It tells the server:
“Headers are done. Body starts next.”
If you forget this, the server will keep waiting, thinking more headers are coming.
Body
The body is optional. You only send a body when you are sending a POST, PUT or PATCH request. A body can be a JSON object or simple text.
{
"name" : "Sushant"
}
How the server actually reads an HTTP request
This is where things get interesting. The server is not “parsing objects”. It’s reading a stream of bytes from a TCP connection.
Something like this (conceptually):
- Read until \r\n → that's your request line
- Keep reading lines until you hit an empty line → those are headers
- Check Content-Length → read that many bytes → that's your body
That’s it. There’s no magic parsing engine. Just careful reading of a byte stream. This is why malformed requests break everything. The server is not guessing — it’s following strict rules.
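The three steps above can be written out almost literally. A toy parser, assuming well-formed input and ignoring edge cases like chunked encoding (`parse_request` is my own sketch, not how any particular server does it):

```python
import io

def parse_request(stream):
    """Parse one HTTP request from a binary stream."""
    # 1. Read until \r\n -> the request line.
    method, target, version = stream.readline().rstrip(b"\r\n").decode().split(" ")
    # 2. Keep reading lines until the empty line -> the headers.
    headers = {}
    while (line := stream.readline().rstrip(b"\r\n")):
        name, _, value = line.decode().partition(":")
        headers[name.strip().lower()] = value.strip()
    # 3. Content-Length says exactly how many body bytes to read.
    body = stream.read(int(headers.get("content-length", 0)))
    return method, target, version, headers, body

raw = (b"POST /api HTTP/1.1\r\n"
       b"Host: example.test\r\n"
       b"Content-Length: 19\r\n"
       b"\r\n"
       b'{"name": "Sushant"}')
print(parse_request(io.BytesIO(raw)))
```

Notice how the empty line and Content-Length do all the work: without them, the parser would have no idea where headers end and how much body to read.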
Server Response
The server's response follows the same structure as the request. Your browser is then responsible for parsing the response and showing you the result.
HTTP Versions
There are several HTTP versions, but the most important are HTTP/1.0, HTTP/1.1 and HTTP/2.x. If you only look at HTTP from a "format" perspective, nothing seems to change much across versions: still request → headers → body. But the real evolution of HTTP is not about structure. It's about performance, connection management, and how data flows over the network.
HTTP 1.0
HTTP/1.0 was very straightforward. One request opens one TCP connection and receives one response. Then the connection is closed. That's it. If your webpage had 1 HTML file, 5 CSS files, 5 JS files, and 10 images, that's 21 separate TCP connections.
And remember, each TCP connection requires a 3-way handshake, and every new connection starts in TCP slow start. So we pay the cost of creating a TCP connection again and again.
HTTP 1.1
HTTP/1.1 didn’t reinvent HTTP. It fixed the obvious inefficiencies. The biggest change was persistent connections. Now instead of closing the connection after one request, we can reuse it.
But then came a new problem. If you send multiple requests on the same connection, responses must come in order. This is called Head-of-Line Blocking.
Example:
- Request A (slow)
- Request B (fast)
Even if B is ready first, it has to wait for A. So now we have a faster connection, but still limited by ordering.
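The ordering cost can be made concrete with a little arithmetic. The processing times below are hypothetical, just to show the shape of the problem:

```python
# Hypothetical server-side processing times (seconds) for two pipelined requests.
times = {"A": 3.0, "B": 0.5}

# HTTP/1.1: responses must be delivered in request order,
# so B's delivery time is dragged up to A's.
finish_http1 = {}
clock = 0.0
for name in ["A", "B"]:
    clock = max(clock, times[name])  # B is ready at 0.5 but must wait behind A
    finish_http1[name] = clock

# HTTP/2 multiplexing (covered next): each stream completes independently.
finish_http2 = dict(times)

print(finish_http1)  # B delivered at 3.0 -- stuck behind A
print(finish_http2)  # B delivered at 0.5 -- no waiting
```

The only thing that changed between the two models is the ordering constraint; the work done per request is identical.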
HTTP 2.0
HTTP/2 didn't change what HTTP means, but it completely changed how data is sent. Some notable improvements:
- Multiplexing - Instead of one request at a time per connection, HTTP/2 allows: Multiple requests and responses in parallel over a single TCP connection. No ordering constraint like HTTP/1.1. So request B can complete first. No waiting. This removes the need for most of the hacks we discussed earlier.
- Binary Protocol - HTTP/1.x was human-readable. HTTP/2 is binary framed. You won't see GET / HTTP/1.1 on the wire; instead, data is split into frames and encoded efficiently. This improves parsing speed, network efficiency and compression, but you lose readability (tooling can decode the frames for you).
- Header Compression - Headers in HTTP are often repetitive, like User-Agent, Cookie and Accept. Sending them again and again is wasteful. HTTP/2 uses HPACK compression to reduce this overhead. In real systems, this saves a lot of bandwidth.
Compression
When you send an HTTP response, you’re not just sending “data”. You’re sending bytes over a network. And network is almost always the bottleneck. Not CPU. Not memory but Network latency + bandwidth. What if we could send fewer bytes for the same response? That’s exactly what HTTP compression does.
The client tells the server (via request headers):
Accept-Encoding: gzip, deflate, brThis means: “I can understand compressed responses using these algorithms.” Now the server decides if it supports compression, which algorithm to use and then responds like this:
Content-Encoding: gzipThe received request body is compressed, and the browser can use the gzip algorithm to decompress it.
Important detail most people miss that HTTP is not compressing anything by itself. It doesn’t “know” gzip or brotli. It just defines headers like: Accept-Encoding Content-Encoding . The actual compression is done by the server (or proxy like Nginx/CDN). Again — HTTP defines rules, not behavior.
HTTP Connection Management
You can build a perfectly working service and still take it down with bad connection management. Not because your logic is wrong, but because your connections are. HTTP doesn’t run in isolation, it runs on top of TCP connections that need careful handling.
Keep Alive
HTTP Keep-Alive is a method for maintaining persistent network connections between a client and server, allowing multiple requests/responses to use a single TCP connection rather than opening a new one each time. But here’s the catch. Keeping connections open means you’re holding resources such as File descriptors , Memory and Kernel State. So it’s not “free performance”. It’s a tradeoff.
Idle Timeouts
If you keep connections alive forever, your server will run out of resources. So every system introduces idle timeouts. “If no request comes for X seconds, close the connection.” Simple idea. Subtle problems. But!!!!
If your timeout is too low:
- Connections close too often → more TCP handshakes → latency
If it’s too high:
- You waste resources on idle clients
Connection pooling
Modern clients are smart enough. They don’t open a new connection per request. They maintain a connection pool. Instead of: open → use → close. They do: open → reuse → reuse → reuse. For example: Browsers reuse connections per domain Backend services (Go, Node, Java) maintain pools internally. This improves performance a lot.
Load Balancer
In real systems, your client is not directly talking to your server. There’s usually a load balancer in between. Example flow: Client → Load Balancer → Backend . Now here’s something subtle: Client ↔ LB connection LB ↔ Backend connection These are two different connections. Load balancers often reuse backend connections aggressively.
HTTP Caching
We can cache HTTP responses and reduce load on the server. The core idea is that when a client receives a response, it can store it and reuse it later instead of hitting the server again. But this is not random. HTTP caching is controlled explicitly using headers.
Example:
Let’s say your server returns this:
HTTP/1.1 200 OK
Cache-Control: max-age=60
Content-Type: application/json
{ "posts": [...] }This means: “You can reuse this response for the next 60 seconds without asking me again.”. Now if the user refreshes the page within 60 seconds than no request goes to the server and response is served instantly from cache.
Before You Go
If you made it this far, Thank You.
I usually write about backend engineering, distributed systems, and things I learn while working on real problems. Not theory — mostly practical stuff that I wish someone had explained to me earlier.
I run a free newsletter where I share these kinds of write-ups. No spam. Just occasional backend engineering notes.