Packets and routing

How data travels across the internet


When you click a link, something remarkable happens. Your request - a simple string of text - gets chopped up into pieces, wrapped in multiple layers of addressing information, and hurled across a global network of interconnected machines. Within milliseconds, it reaches a server that might be on the other side of the planet. How does your data know where to go? And how does it get there so fast?

The answer lies in one of the most elegant engineering solutions ever devised: [[packet switching]]. Instead of establishing a dedicated circuit between two computers (like old telephone lines), the internet breaks your data into small chunks called [[packets]] and routes each one independently. It's like sending a book through the mail by ripping out each page and mailing them separately - except every page is numbered, so the recipient can put them back in the right order no matter how they arrive.

This approach has a profound advantage: efficiency. In a circuit-switched network (like traditional phone calls), a dedicated path is reserved for the entire duration of the conversation, even during silences. With packet switching, the network's capacity is shared dynamically. Your packets interleave with everyone else's, and idle connections don't waste resources. This is why the internet can support billions of simultaneous connections on shared infrastructure.

What's inside a packet?

Every packet is essentially an envelope containing two things: a [[header]] (the addressing information) and a [[payload]] (the actual data). The header tells routers where the packet came from, where it's going, and how to handle it. The payload carries your actual message - a piece of an image, some HTML, part of a video frame.

But packets don't exist in isolation - they're wrapped in multiple layers of headers as they travel down the network stack. This is called [[encapsulation]]. Your HTTP request becomes the payload of a TCP segment. That segment becomes the payload of an IP packet. That packet becomes the payload of an Ethernet frame. Each layer adds its own header with information relevant to that layer's job.
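The nesting described above can be sketched in a few lines of Python. This is a toy illustration, not a real network stack: the bracketed labels stand in for binary headers, and the `encapsulate` helper is a made-up name for this example.

```python
def encapsulate(payload: bytes, header: bytes) -> bytes:
    """Wrap a payload by prepending one layer's header."""
    return header + payload

# Placeholder headers; real ones are binary structures, not text labels.
http_request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
tcp_segment = encapsulate(http_request, b"[TCP]")
ip_packet = encapsulate(tcp_segment, b"[IP]")
ethernet_frame = encapsulate(ip_packet, b"[ETH]")

# The outermost header comes first on the wire.
print(ethernet_frame[:14])
```

Each layer treats everything handed to it from above as opaque payload, which is why the HTTP request ends up innermost.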

Packet structure: TCP header fields (header size 20-60 bytes; connection-oriented)

  • Source Port: 16 bits
  • Dest Port: 16 bits
  • Sequence Number: 32 bits
  • Acknowledgment: 32 bits
  • Offset: 4 bits
  • Reserved: 3 bits
  • Flags: 9 bits
  • Window Size: 16 bits
  • Checksum: 16 bits
  • Urgent Pointer: 16 bits
  • Options + Padding
  • Data payload

An IPv4 header contains 20+ bytes of crucial information: version number, header length, type of service (for quality of service), total length, identification (for fragment reassembly), flags, fragment offset, time to live (TTL), protocol number (TCP=6, UDP=17), header checksum, source IP address, and destination IP address. Every router examines this header to make forwarding decisions.
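To make the field list concrete, here is a minimal sketch that decodes the fixed 20-byte part of an IPv4 header with Python's `struct` module. The function name `parse_ipv4_header`, the sample addresses, and the zeroed checksum are assumptions for illustration; options beyond the first 20 bytes are ignored.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte portion of an IPv4 header (options ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                     # upper 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,    # IHL counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,                   # upper 3 bits
        "fragment_offset": flags_frag & 0x1FFF,      # lower 13 bits
        "ttl": ttl,
        "protocol": proto,                           # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample header; the checksum is left as 0 for brevity.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 1, 0x4000, 64, 6, 0,
    socket.inet_aton("192.168.1.10"),
    socket.inet_aton("93.184.216.34"),
)
fields = parse_ipv4_header(sample)
print(fields["version"], fields["ttl"], fields["protocol"], fields["dst"])
```

The `!` in the format string selects network (big-endian) byte order, which is how these fields appear on the wire.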

TCP (Transmission Control Protocol) segments carry additional overhead - sequence numbers, acknowledgments, checksums, window sizes, and control flags. This makes TCP reliable: if a segment gets lost, the sender knows to resend it. If segments arrive out of order, the receiver can reassemble them correctly. The sequence number identifies the position of the segment's first data byte in the overall stream.
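As a rough sketch of how sequence numbers allow reordering, the function below rebuilds a byte stream from segments that arrive out of order or more than once. It is a simplification under stated assumptions: `reassemble` is a hypothetical helper, at least one segment is present, all segments are available at once, and sequence-number wraparound is ignored.

```python
def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a byte stream from (sequence_number, data) pairs.

    Duplicates are skipped. A gap raises ValueError, since a real
    receiver would wait for the missing bytes to be retransmitted.
    """
    stream = bytearray()
    next_seq = min(seq for seq, _ in segments)  # first byte expected
    for seq, data in sorted(segments):
        if seq + len(data) <= next_seq:
            continue  # everything in this segment was already delivered
        if seq > next_seq:
            raise ValueError(f"gap: missing bytes starting at {next_seq}")
        stream += data[next_seq - seq:]  # append only the new bytes
        next_seq = seq + len(data)
    return bytes(stream)

# Segments arriving out of order, with one duplicate.
arrived = [(1005, b"world"), (1000, b"hello"), (1000, b"hello")]
print(reassemble(arrived))
```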

UDP (User Datagram Protocol) strips away most of that overhead - its header holds just source port, destination port, length, and checksum. It's a 'fire and forget' protocol - you send the datagram and hope it arrives. This sounds unreliable, but it's perfect for real-time applications like video calls, gaming, or DNS queries where a dropped frame is better than a delayed one. By the time a lost packet could be resent, you've already moved on to the next frame.
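The four UDP header fields fit in eight bytes, which a short sketch can show. The helper name `build_udp_datagram` and the port numbers are illustrative assumptions; the checksum is left at zero, which over IPv4 signals that no checksum was computed.

```python
import struct

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to a payload."""
    length = 8 + len(payload)  # the length field covers header plus data
    checksum = 0               # 0 = "not computed" (allowed over IPv4)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload

datagram = build_udp_datagram(50000, 53, b"hi")
print(len(datagram), struct.unpack("!HHHH", datagram[:8]))
```

Compare this with the 20-60 byte TCP header: there is no sequence number, acknowledgment, or window to maintain.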

The TCP three-way handshake

Before TCP can send any data, it needs to establish a connection. This happens through a ritual called the [[three-way handshake]], which synchronizes both sides and establishes initial parameters:

  • SYN: Your computer sends a packet with the SYN (synchronize) flag set, containing an initial sequence number (ISN). This says 'I want to talk, and I'll start numbering my bytes from here.'
  • SYN-ACK: The server responds with both SYN and ACK flags set, its own ISN, and an acknowledgment number equal to your ISN+1. This says 'I hear you, here's my starting number, and I'm ready to receive your next byte.'
  • ACK: Your computer sends a final ACK confirming receipt of the server's SYN. The connection is now established and data can flow in both directions.

It can seem wasteful - three messages, costing one full round trip, before the client can send any data. But it ensures both sides are ready and establishes the initial [[sequence numbers]] that keep everything in order. The ISN is randomized to prevent sequence number prediction attacks, where an attacker could inject packets into an existing connection.
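The sequence and acknowledgment numbers exchanged in the handshake can be traced with a small model. This is a sketch, not a TCP implementation: the dictionaries stand in for segments, `three_way_handshake` is a hypothetical name, and a plain random 32-bit value is used as a stand-in for however a real stack picks its ISN.

```python
import secrets

SEQ_MOD = 2**32  # sequence numbers are 32 bits and wrap around

def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake messages as simple dictionaries."""
    if client_isn is None:
        client_isn = secrets.randbits(32)
    if server_isn is None:
        server_isn = secrets.randbits(32)
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MOD,  # the SYN consumes one number
    }
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_MOD,
        "ack": (server_isn + 1) % SEQ_MOD,
    }
    return [syn, syn_ack, ack]

for message in three_way_handshake(client_isn=100, server_isn=300):
    print(sorted(message["flags"]), message["seq"], message.get("ack"))
```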

TCP also implements [[flow control]] through a sliding window mechanism. The receiver advertises how much buffer space it has available (the 'window size'), and the sender limits how much unacknowledged data it can have in flight. This prevents a fast sender from overwhelming a slow receiver. Modern TCP also implements [[congestion control]] algorithms like CUBIC and BBR that dynamically adjust sending rate based on network conditions.
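The sender's side of the sliding window comes down to one subtraction. The sketch below assumes a simplified model with hypothetical parameter names; a real sender also caps this figure by its congestion window.

```python
def bytes_allowed(last_byte_sent: int, last_byte_acked: int,
                  advertised_window: int) -> int:
    """How many more bytes the sender may transmit right now."""
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acknowledged
    return max(0, advertised_window - in_flight)

print(bytes_allowed(5000, 3000, 4096))  # 2000 bytes in flight
print(bytes_allowed(9000, 3000, 4096))  # window already full
```

As acknowledgments arrive, `last_byte_acked` advances and the window "slides" forward, freeing room for more data.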

IP addresses and routing

Every device that communicates over the internet uses an [[IP address]] - a numeric identifier that routers use to forward packets. IPv4 addresses look like 192.168.1.1 (four numbers from 0-255, representing 32 bits), while IPv6 addresses are longer hexadecimal strings like 2001:0db8:85a3:0000:0000:8a2e:0370:7334 (128 bits, allowing for about 340 undecillion addresses).
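Python's standard `ipaddress` module makes it easy to see that both notations are just ways of writing a large integer. A brief sketch using the two example addresses from the paragraph above:

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

print(v4.max_prefixlen, int(v4))   # 32-bit address as an integer
print(v6.max_prefixlen)            # 128-bit address
print(v6.compressed)               # shortened form with zero runs collapsed
print(2**128)                      # size of the IPv6 address space
```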

IPv4 addresses are divided into network and host portions using [[subnet masks]]. A mask like 255.255.255.0 (/24 in CIDR notation) means the first 24 bits identify the network, and the last 8 bits identify specific hosts. This hierarchical structure is essential for routing - routers don't need to know about every individual device, just which direction to send packets for each network prefix.
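The network/host split is a bitwise AND with the mask. The sketch below shows it twice: once by hand, with a hypothetical `split_address` helper, and once via the standard `ipaddress` module. The host address 192.168.1.42 is an arbitrary example.

```python
import ipaddress

def split_address(addr: str, prefix_len: int) -> tuple[int, int]:
    """Return (network_part, host_part) of an IPv4 address as integers."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    value = int(ipaddress.IPv4Address(addr))
    return value & mask, value & ~mask & 0xFFFFFFFF

network_part, host_part = split_address("192.168.1.42", 24)
print(ipaddress.IPv4Address(network_part), host_part)

# The same facts from the standard library.
net = ipaddress.ip_network("192.168.1.0/24")
print(net.netmask, net.num_addresses)
print(ipaddress.ip_address("192.168.1.42") in net)
```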

When a router receives a packet, it looks at the destination IP address and consults its [[routing table]] - essentially a lookup table that says 'packets for this range of addresses should go out this interface.' When several entries match, the most specific one (the longest prefix) wins. Routers don't know the full path to the destination; they just know the next hop. Each router makes an independent forwarding decision, passing the packet closer to its destination.
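A forwarding lookup can be sketched as a search for the most specific matching prefix (longest-prefix match). The route entries and interface names below are invented for illustration, and a linear scan is used for clarity; real routers rely on specialized data structures and hardware for speed.

```python
import ipaddress

# Hypothetical routing table: (prefix, outgoing interface).
ROUTES = [
    ("0.0.0.0/0", "isp-uplink"),  # default route matches everything
    ("10.0.0.0/8", "eth1"),
    ("10.1.0.0/16", "eth2"),
]

def next_hop(dest: str, routes=ROUTES) -> str:
    """Pick the interface whose prefix matches `dest` most specifically."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), interface)
        for prefix, interface in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        raise LookupError(f"no route to {dest}")
    _, interface = max(matches, key=lambda match: match[0].prefixlen)
    return interface

print(next_hop("10.1.2.3"), next_hop("10.9.9.9"), next_hop("8.8.8.8"))
```

Note that the function returns only an outgoing interface, never a full path: each router repeats the same kind of lookup for itself.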

Routing tables are populated by [[routing protocols]] - BGP (Border Gateway Protocol) between autonomous systems (like ISPs), and interior protocols like OSPF or IS-IS within organizations. BGP is particularly crucial - it's the protocol that glues the internet together, with each ISP announcing which IP prefixes it can reach. When BGP goes wrong (misconfigurations, hijacking attacks), large portions of the internet can become unreachable.

NAT: stretching IPv4

There are only about 4.3 billion possible IPv4 addresses, far fewer than the number of devices connected to the internet today. [[NAT]] (Network Address Translation) is the hack that's kept IPv4 alive. Your home router has one public IP address assigned by your ISP, but all devices on your home network share it using private addresses (like 192.168.x.x or 10.x.x.x).

When you make a request, your router rewrites the source IP and port, keeping track of the mapping in a translation table. When responses come back, it reverses the translation. This works well for outgoing connections but complicates incoming connections - you can't easily run a server behind NAT without port forwarding or NAT traversal techniques like STUN and TURN (commonly used for video calls).
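The translation table can be modeled as a pair of dictionaries. This is a deliberately minimal sketch: the `Nat` class, the starting port 40000, and the addresses are assumptions for illustration, and real NAT devices also track protocol, remote endpoint, and timeouts.

```python
class Nat:
    """Toy port-translating NAT with a single public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map = {}  # (private_ip, private_port) -> public_port
        self.inbound_map = {}   # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        key = (private_ip, private_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound_map[key]

    def inbound(self, public_port: int):
        """Reverse the translation; None means no mapping, so the packet is dropped."""
        return self.inbound_map.get(public_port)

nat = Nat("203.0.113.5")
print(nat.outbound("192.168.1.20", 51515))
print(nat.inbound(40000))
print(nat.inbound(40001))  # nobody inside asked for this
```

The last call shows why unsolicited incoming connections fail: without an existing mapping, the router has no idea which private device should receive the packet.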

MTU and fragmentation

Networks have a [[Maximum Transmission Unit]] (MTU) - the largest packet size a link can carry. Ethernet has a standard MTU of 1500 bytes. If an IPv4 packet exceeds the MTU of any link along its path, it must be [[fragmented]] into smaller pieces (or, if the sender set the Don't Fragment flag, dropped with an ICMP error). Each fragment gets its own IP header and travels independently; they're reassembled only at the final destination. IPv6 routers never fragment; only the sending host can.
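The arithmetic of IPv4 fragmentation is easy to work through in code. The sketch below assumes a 20-byte header with no options and uses a hypothetical `fragment` helper; it computes only sizes and offsets rather than building real packets. Fragment offsets are expressed in 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes.

```python
def fragment(total_length: int, mtu: int = 1500, header_len: int = 20):
    """Return (offset_in_8_byte_units, data_bytes, more_fragments) per fragment."""
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8  # round down to a multiple of 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_data, payload - offset)
        more = offset + size < payload       # the "More Fragments" flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments

# A 4000-byte packet crossing a link with a 1500-byte MTU.
for frag in fragment(4000):
    print(frag)
```

Three fragments each carry their own 20-byte header, so 4000 bytes become 4040 bytes on the wire.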

Fragmentation is problematic - it increases overhead, and if any fragment is lost, the entire original packet must be retransmitted. Modern systems use [[Path MTU Discovery]] to find the smallest MTU along the path and size packets accordingly. In TCP, each side also advertises a [[Maximum Segment Size]] (MSS) during connection setup, so segments are sized to fit within the MTU and fragmentation is usually avoided.
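The relationship between MTU and MSS is a simple subtraction, shown here as a sketch. The helper name `mss_for_mtu` is an assumption, and the defaults assume IPv4 and TCP headers without options.

```python
def mss_for_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one packet of size `mtu`."""
    return mtu - ip_header - tcp_header

print(mss_for_mtu(1500))                # IPv4 over standard Ethernet
print(mss_for_mtu(1500, ip_header=40))  # IPv6 has a 40-byte fixed header
```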

TTL and preventing infinite loops

What stops a packet from bouncing around the internet forever if something goes wrong? The [[TTL]] (Time To Live) field. Despite its name, TTL isn't measured in time - it's a hop count. Each router decrements the TTL by one. When it hits zero, the packet is discarded and an ICMP 'Time Exceeded' message is sent back to the sender.

This is how the traceroute command works - it sends packets with incrementally increasing TTLs (1, 2, 3...) and records which router sends back the 'Time Exceeded' message at each hop, mapping out the path your packets take across the internet.

# Trace the route to example.com
$ traceroute example.com

1  router.local (192.168.1.1)  1.234 ms
2  isp-gateway (10.0.0.1)  12.456 ms
3  core-router.isp.net (72.14.215.1)  15.789 ms
4  edge-router.backbone.net (209.85.142.5)  22.345 ms
... (more hops) ...
12 93.184.216.34  45.123 ms
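The TTL trick behind traceroute can be simulated without touching the network. This is a toy model under stated assumptions: `forward` and `traceroute` are hypothetical functions, the path is a fixed list of names with the destination last, and timing, ICMP details, and routing changes between probes are ignored.

```python
def forward(path: list[str], ttl: int) -> tuple[str, str]:
    """Send one packet along `path` (routers in order, destination host last)."""
    for router in path[:-1]:
        ttl -= 1                 # each router decrements the TTL by one
        if ttl == 0:
            return ("time-exceeded", router)  # router discards and reports
    return ("delivered", path[-1])

def traceroute(path: list[str]) -> list[str]:
    """Probe with TTL = 1, 2, 3... and record who answers each time."""
    hops = []
    for ttl in range(1, len(path) + 1):
        status, responder = forward(path, ttl)
        hops.append(responder)
        if status == "delivered":
            break
    return hops

route = ["router.local", "isp-gateway", "core-router.isp.net", "example.com"]
print(traceroute(route))
```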

Quality of Service and traffic shaping

Not all packets are created equal. Video call audio is more time-sensitive than email. The [[Type of Service]] (ToS) byte in the IP header allows packets to be marked with priority levels; it has since been redefined so that its upper six bits carry the [[DSCP]] (Differentiated Services Code Point) and its lower two bits carry congestion notification. Routers can use this marking to implement [[QoS]] policies - putting voice traffic in a priority queue, or throttling bulk downloads during congestion.
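Reading the marking out of the header byte takes two bit operations. In this sketch the helper names are assumptions; the value 46 is the code point commonly used for voice traffic (Expedited Forwarding).

```python
def split_ds_field(tos_byte: int) -> tuple[int, int]:
    """Split the old ToS byte into (DSCP, ECN)."""
    return tos_byte >> 2, tos_byte & 0b11  # upper 6 bits, lower 2 bits

def make_ds_field(dscp: int, ecn: int = 0) -> int:
    """Combine a 6-bit DSCP and a 2-bit ECN value into one byte."""
    return (dscp << 2) | ecn

EXPEDITED_FORWARDING = 46  # DSCP commonly used for voice
byte_value = make_ds_field(EXPEDITED_FORWARDING)
print(hex(byte_value), split_ds_field(byte_value))
```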

ISPs use traffic shaping to manage their networks - policing bandwidth limits, prioritizing certain traffic types, and sometimes controversially throttling specific services. Net neutrality debates often center on whether ISPs should be allowed to discriminate between packets based on their source or content.

The internet is not a big truck. It's a series of tubes.

Senator Ted Stevens (2006) - technically more accurate than he knew
How Things Work - A Visual Guide to Technology