Deep Dive - Networking for Web Engineers
You write code that runs on the web. You build REST APIs in Node or Go. You fetch data in React. You understand your code perfectly.
But what happens between the moment your code calls fetch and the moment the server receives that request?
For many engineers, this space is a black box: they treat the network as a magic pipe that instantly teleports data. That ignorance creates performance bottlenecks that no amount of code optimization can fix. If you do not understand the cost of a handshake or the physics of latency, you cannot build a truly performant application.
This post will peel back the layers of abstraction. We will trace the life of an HTTP request not through code but through the wires and protocols that make the web possible.
The Invisible Database (DNS)
Everything starts with a hostname like api.google.com. Your computer does not know how to route data to a name. It needs an IP address.
Most developers think of DNS as a simple lookup: you ask for a name, you get a number. In reality, DNS is a complex, hierarchical, distributed database with heavy caching implications.
The Recursive Resolver and ECS
When your browser needs an IP, it does not go straight to Google. It asks a Recursive Resolver, usually provided by your ISP or a public provider like Cloudflare's 1.1.1.1.
But how does Google know to give you an IP in Mumbai instead of New York?
The Resolver forwards a truncated portion of your IP address (typically just the subnet, not the full address) to the authoritative server using an extension called EDNS Client Subnet (ECS). The CDN's authoritative server uses that subnet to return the address of the edge server nearest to you.
If you use a privacy focused resolver that strips ECS, you might actually get slower speeds, because the CDN cannot locate you geographically and may hand you a distant server.
The TTL Trap
Every DNS record has a Time To Live (TTL). This is a number in seconds that tells the resolver how long to cache the IP.
Suppose a deployment goes wrong and your DNS record now points at a broken load balancer. You update the record to point back to the healthy one. But if your TTL is 3600 seconds (1 hour), resolvers that cached the old answer will keep sending users to the broken server for up to an hour, regardless of your fix.
The best practice is to keep TTL low (e.g. 60 seconds) for critical infrastructure to allow for fast failover.
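To make the TTL trap concrete, here is a toy resolver cache in Python. Everything in it is invented for illustration (the hostname, the IPs, the authoritative lookup callback); real resolvers are far more elaborate, but the caching behavior is the same:

```python
import time

class ResolverCache:
    """Minimal sketch of a recursive resolver's cache (illustrative only)."""

    def __init__(self):
        self._cache = {}  # name -> (ip, expires_at)

    def resolve(self, name, authoritative, now=None):
        now = time.time() if now is None else now
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]            # cache hit: upstream changes are invisible until expiry
        ip, ttl = authoritative(name)  # cache miss: ask the authoritative server
        self._cache[name] = (ip, now + ttl)
        return ip
```

Until the cached entry expires, changing the authoritative record changes nothing for users behind that resolver, which is exactly why a one-hour TTL turns a quick rollback into an hour-long partial outage.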
The Cost of Connection (Handshake)
Once you have the IP address you cannot just send data. You must establish a connection. This is where the laws of physics hurt your application performance.
The TCP 3 Way Handshake
Before a single byte of your HTTP request is sent, the client and server must agree to speak. This is the SYN, SYN-ACK, ACK dance.
Client sends SYN (Synchronize) with a random Sequence Number.
Server replies with SYN-ACK (Synchronize-Acknowledge).
Client replies with ACK.
This takes 1.5 Round Trips (RTT).
If your user is in Delhi and your server is in Virginia, the RTT might be 250ms. The user stares at a white screen for nearly 400ms just setting up the pipe.
TCP Fast Open (TFO) - Modern kernels support TFO, which allows the client to send data inside the SYN packet, effectively removing a full round trip of latency. However, this requires support from the OS on both ends, and middleboxes along the path often interfere with it.
The TLS Handshake (HTTPS)
It gets worse. The web is secure so we need encryption. This requires a TLS Handshake on top of the TCP handshake.
TLS 1.2 - Adds another 2 RTTs.
TLS 1.3 - Optimized to 1 RTT: the client guesses the key exchange parameters (e.g. which Diffie-Hellman group the server supports) and sends its key share in the very first hello message.
For a new secure connection, the user waits 3 to 4 round trips before sending their request. On a mobile 4G network this can easily be 600ms of delay. This is why Keep-Alive connections are critical: they let you reuse an already-established connection for multiple requests, avoiding this setup penalty.
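A rough back-of-envelope model of this setup cost, in Python. This is a deliberate simplification: it counts only the handshake round trips described above and ignores DNS, TFO, TLS session resumption, and 0-RTT:

```python
def connection_setup_ms(rtt_ms, tls_version=None, reuse=False):
    """Estimate milliseconds spent before the first byte of an HTTP request
    reaches the server, for a new vs reused connection (simplified model)."""
    if reuse:
        return 0.0        # Keep-Alive: the handshakes were already paid for
    rtts = 1.5            # TCP: SYN, SYN-ACK, then the request rides after the ACK
    if tls_version == "1.2":
        rtts += 2         # two full TLS round trips
    elif tls_version == "1.3":
        rtts += 1         # key share sent optimistically in the first hello
    return rtts * rtt_ms
```

With a 250ms RTT, a cold TLS 1.2 connection costs 875ms before the server even sees the request; TLS 1.3 trims that to 625ms, and a reused connection costs nothing.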
The Guarantee of Order
Web engineers love HTTP, but HTTP is just a text protocol running inside TCP packets. TCP provides the magic of reliability.
The internet is chaotic. Routers drop packets. Packets arrive out of order. TCP fixes this using Sequence Numbers and the Sliding Window.
If you send three packets A, B, and C but they arrive as C, A, B, the server's TCP stack waits, rearranges them into A, B, C, and only then hands the data to your Node or Java application.
Head of Line Blocking
This reliability comes at a cost. If packet A is lost but packet B and C arrive the operating system cannot give B and C to your application yet. It must wait for A to be retransmitted.
This is Head of Line Blocking. Your application perceives it as "network lag" or a "stalled request", when in fact the data is sitting right there in the kernel, waiting for the missing piece.
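The receiver's reassembly logic can be sketched in a few lines of Python. This is a toy model, not real kernel code: sequence numbers are byte offsets, and each tuple is one arriving segment:

```python
def deliver_in_order(packets):
    """Simulate a TCP receiver. `packets` is a list of (seq, data) tuples in
    arrival order; returns the chunks handed to the application, which only
    grow while the next expected byte offset is present (simplified)."""
    buffer = {}     # out-of-order segments parked in the kernel
    expected = 0    # next byte offset the application is owed
    delivered = []
    for seq, data in packets:
        buffer[seq] = data
        while expected in buffer:          # hand over contiguous data only
            chunk = buffer.pop(expected)
            delivered.append(chunk)
            expected += len(chunk)
    return delivered
```

Feed it C, A, B and the application still receives A, B, C. Drop A entirely and the application receives nothing at all, even though B and C arrived: that is Head of Line Blocking in miniature.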
HTTP Versions and Speed Evolution
The application layer protocol has evolved specifically to solve these networking constraints.
The Serial Era (HTTP 1.1)
In HTTP 1.1, a connection can carry only one request at a time; the next request cannot go out until the previous response has finished. (Pipelining exists in the spec, but browsers disabled it because it broke too often in practice.)
If you need to fetch 10 images, the browser opens 6 connections (the typical per-host browser limit), downloads 6 images in parallel, and makes the other 4 wait. This creates the "Waterfall" you see in the Chrome Network tab.
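A crude model of that queueing in Python. The function and its assumptions are invented for illustration: it charges each asset one round trip and ignores handshakes, transfer time, and bandwidth entirely:

```python
import math

def http1_fetch_time_ms(n_assets, rtt_ms, max_conns=6):
    """Rough HTTP 1.1 waterfall model: one request at a time per connection,
    up to max_conns parallel connections, one RTT per asset (simplified)."""
    batches = math.ceil(n_assets / max_conns)  # assets are served in waves
    return batches * rtt_ms
```

Ten images over a 100ms link take two waves (200ms) instead of one, and a page with 13 assets takes three. The waterfall is a direct consequence of the per-connection serialization, not of bandwidth.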
The Multiplexing Era (HTTP 2)
HTTP 2 introduced the Binary Framing Layer. It chops requests and responses into tiny frames and mixes them on the same wire.
Multiplexing - You can send many requests simultaneously over a single TCP connection.
HPACK Compression - In HTTP 1.1 every request repeats massive headers like User-Agent and Cookie. HTTP 2 uses HPACK to compress these headers, often reducing overhead by 90%.
The Flaw - It still runs on TCP, so it still suffers from TCP Head of Line Blocking. If one TCP packet drops, all the multiplexed streams pause.
The UDP Revolution (HTTP 3)
This is the bleeding edge. HTTP 3 runs on QUIC, which sits on top of UDP, not TCP.
Why UDP?
UDP does not care about order. It fires packets and forgets them.
QUIC
QUIC adds reliability back on top of UDP, but it does so per stream. If a packet carrying stream A is lost, only stream A waits; streams B and C continue processing.
Connection Migration
In TCP, your connection is defined by your IP address. If you switch from Wi-Fi to 4G, your IP changes and the connection dies. QUIC uses a Connection ID instead: you can switch networks and the download continues without interruption.
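The per-stream independence can be sketched as a toy model in Python. The data structures are invented for illustration and bear no resemblance to the real QUIC wire format; the point is only that loss on one stream does not stall the others:

```python
def quic_deliver(packets):
    """Toy per-stream reassembly. `packets` is a list of (stream_id, seq,
    data) tuples in arrival order; returns, per stream, the chunks handed
    to the application. A gap in one stream stalls only that stream."""
    buffers = {}    # stream_id -> {byte offset: data} parked segments
    expected = {}   # stream_id -> next byte offset owed to the app
    delivered = {}  # stream_id -> chunks already handed over
    for sid, seq, data in packets:
        buffers.setdefault(sid, {})[seq] = data
        expected.setdefault(sid, 0)
        out = delivered.setdefault(sid, [])
        while expected[sid] in buffers[sid]:   # contiguous data only, per stream
            chunk = buffers[sid].pop(expected[sid])
            out.append(chunk)
            expected[sid] += len(chunk)
    return delivered
```

If stream A is missing its first segment while stream B arrives intact, B is delivered in full while only A waits for the retransmission. Under TCP, both would have been blocked.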
This move to UDP is not just theoretical. In my earlier post, The Spotify Story: Rethinking the Network Layer, I detailed how Spotify faced these exact latency challenges. They realized that for streaming audio, TCP's strict ordering was often a liability. By rethinking the layer below HTTP, they optimized the experience long before HTTP 3 became the global standard.
Congestion Control (TCP Slow Start)
Have you ever noticed that a video starts pixelated and then clears up? Or that a download starts slow and then speeds up?
This is TCP Slow Start governed by the Congestion Window (CWND).
The server does not know how much bandwidth the path to the client can handle. If it blasted 100MB at once, it might flood the buffers of a router along the way.
The server starts with an Initial Congestion Window (initcwnd) of 10 packets (approx 14KB).
Each round trip of successfully acknowledged packets doubles the window.
It keeps doubling until packets start getting dropped, at which point it backs off and switches to Congestion Avoidance.
The 14KB Rule
Because of this slow start the first 14KB of your HTML or API response is special. It is the only data that can be delivered in the first round trip.
If your critical CSS or JSON data fits in that first 14KB, your site will feel instant. If it is 15KB, the user waits for another full round trip.
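The arithmetic behind the 14KB rule can be checked with a short sketch. It is idealized on purpose: the window doubles every round trip, no packets are lost, and 1460 bytes is assumed as a typical TCP segment payload:

```python
def round_trips_to_deliver(size_bytes, mss=1460, initcwnd=10):
    """How many round trips idealized TCP slow start needs to deliver a
    payload, assuming the window doubles each RTT and nothing is lost."""
    cwnd, sent, rtts = initcwnd, 0, 0
    while sent < size_bytes:
        sent += cwnd * mss   # one full window delivered this round trip
        rtts += 1
        cwnd *= 2            # slow start: exponential growth
    return rtts
```

A 14KB response fits inside the initial 10-segment window and lands in one round trip; a 15KB response needs two; a 100KB response needs three. On a 250ms RTT link, that one extra kilobyte costs a quarter of a second.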
Conclusion
Networking is not just piping bytes. It is a complex management of state (handshakes), queues (buffers), and reliability guarantees (ACKs).
As a web engineer you cannot fix the speed of light. But you can respect it.
Use CDNs and ECS to bring the server closer to the user reducing RTT.
Reuse connections with Keep Alive to avoid the handshake tax.
Understand Slow Start and fit your critical path into the first 14KB.
The network is the most hostile environment your code will ever run in. Design accordingly.


