What Is Network Latency? Propagation, Queuing, and RTT

Network latency is the time it takes for a unit of data to travel from its source to its destination. It is not bandwidth, not throughput, not packet loss — it is pure time delay, measured in milliseconds. Latency is what determines whether an application feels instant or sluggish, whether a video call is natural or frustrating, whether a financial trading system captures an arbitrage opportunity or misses it by a microsecond. Understanding where latency comes from, how it compounds, and how it can be reduced requires looking at physics, protocol design, queuing theory, and network architecture together.

The Four Sources of Latency

Total end-to-end latency is the sum of four fundamentally different types of delay, each with different causes and different remedies.

1. Propagation delay is the time for a signal to travel from point A to point B. In fiber optic cable, light travels at roughly 200,000 km/s — about two-thirds of the speed of light in a vacuum (c × 0.66, depending on the glass's refractive index). This gives approximately 5 microseconds per kilometer, or 1 millisecond per 200 km. A fiber path from New York to London is roughly 6,800 km of physical cable (following the ocean floor, not a straight line), yielding a minimum one-way propagation delay of about 34 ms. Round-trip time (RTT) is therefore at least 68 ms — and that is the absolute minimum, assuming zero other delays and perfectly direct fiber routing. Real transatlantic RTTs are typically 70–85 ms due to indirect cable paths and amplifier delays.

Propagation delay cannot be reduced without bending the laws of physics or finding shorter physical paths. This is why submarine cable routing matters and why companies pay enormous sums for cable landing rights at geographically advantageous locations.

2. Serialization delay is the time to push all the bits of a packet onto the wire. A 1500-byte Ethernet frame (12,000 bits) on a 1 Gbps link takes 12 microseconds to serialize. On a 100 Mbps link, the same frame takes 120 microseconds. On a 1.5 Mbps DSL uplink, a 1500-byte frame takes 8 milliseconds to transmit — enough to noticeably delay a VoIP packet queued behind a data transfer. Serialization delay is inversely proportional to link speed and disappears as a concern on high-speed links (10G+), but remains significant on access links and low-speed WAN circuits.

3. Queuing delay is the time a packet spends waiting in a buffer to be transmitted, because the link is currently busy transmitting earlier packets. This is the most variable and most controllable source of latency. Under zero load, queuing delay is zero. Under high load, packets pile up in buffers and wait their turn. The pathological case is bufferbloat: overly large buffers absorb bursts but add hundreds of milliseconds of queuing delay at sustained high utilization.

4. Processing delay is the time routers, switches, and network devices spend making forwarding decisions, looking up routes, applying firewall rules, and performing NAT translations. On modern hardware with dedicated ASICs, per-packet processing is typically sub-microsecond. On software routers or heavily loaded systems, processing can add milliseconds per hop. In complex deep packet inspection or proxy environments, processing delay can dominate.
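The four components above can be added up in a few lines. What follows is a minimal sketch, assuming the 200,000 km/s fiber speed quoted earlier and a single link; the function names and the default queuing and processing values are illustrative assumptions rather than measurements.

```python
FIBER_KM_PER_S = 200_000  # approximate signal speed in fiber, as quoted in the text


def propagation_delay_ms(distance_km, speed_km_per_s=FIBER_KM_PER_S):
    """Time for a signal to cover distance_km, in milliseconds."""
    return distance_km / speed_km_per_s * 1000


def serialization_delay_ms(frame_bytes, link_bps):
    """Time to clock every bit of a frame onto the link, in milliseconds."""
    return frame_bytes * 8 / link_bps * 1000


def one_way_latency_ms(distance_km, frame_bytes, link_bps,
                       queuing_ms=0.0, processing_ms=0.0):
    """Sum of the four delay components for one link (simplified model)."""
    return (propagation_delay_ms(distance_km)
            + serialization_delay_ms(frame_bytes, link_bps)
            + queuing_ms
            + processing_ms)


# New York to London over 6,800 km of fiber.
print(round(propagation_delay_ms(6_800), 3))           # 34.0
# A 1500-byte frame on a 1 Gbps link and on a 1.5 Mbps DSL uplink.
print(round(serialization_delay_ms(1500, 1e9), 3))     # 0.012
print(round(serialization_delay_ms(1500, 1.5e6), 3))   # 8.0
# Idle queue, negligible processing: propagation dominates.
print(round(one_way_latency_ms(6_800, 1500, 1e9), 3))  # 34.012
```

On a long path the propagation term dwarfs everything else; on a slow access link the serialization term takes over.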

Worked Examples: RTT by Path Type

| Path | Approx. Distance | Typical RTT | Limiting Factor |
|---|---|---|---|
| Same data center (same rack) | <100 m | 0.1–0.5 ms | Processing, serialization |
| Cross-city (NYC to Boston) | ~350 km fiber | 4–8 ms | Propagation + router hops |
| Coast-to-coast US (NYC–LA) | ~4,500 km fiber | 60–80 ms | Propagation |
| Transatlantic (NYC–London) | ~6,800 km fiber | 70–85 ms | Propagation (submarine cable) |
| US to Asia-Pacific (NYC–Tokyo) | ~13,000 km fiber | 150–180 ms | Propagation |
| Geostationary satellite (GEO) | ~72,000 km (2× 36,000 km) | 500–700 ms | Propagation (altitude) |
| LEO satellite (Starlink) | ~1,100 km (550 km alt × 2) | 25–50 ms | Propagation + routing overhead |

Geostationary satellites orbit at 35,786 km. A signal must travel from Earth to the satellite and back down, at minimum 71,572 km, which at the speed of light in a vacuum adds about 239 ms one-way, or about 477 ms round trip, from pure propagation alone. Real GEO RTTs are 500–700 ms because ground stations are rarely directly beneath the satellite, so the slant path is longer, and the uplink/downlink ground infrastructure adds processing. This makes GEO fundamentally unsuitable for interactive applications regardless of how much bandwidth the link provides.

Starlink LEO satellites orbit at ~550 km. One-way propagation to a satellite directly overhead is about 1.8 ms, so a round trip through the satellite (up and down in each direction) costs at least about 7 ms, and more when the satellite sits lower on the horizon. The signal must still traverse the ground network to reach its destination. Real Starlink RTTs are 25–50 ms, competitive with cable internet for long-distance paths but not for local traffic.
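The satellite figures follow from altitude alone. Here is a minimal sketch of the lower bound, assuming a single satellite directly overhead and radio signals moving at the speed of light in a vacuum; the function name is hypothetical, and real paths are longer because of slant angles and ground routing.

```python
C_KM_PER_S = 299_792.458  # speed of light in a vacuum


def min_satellite_rtt_ms(altitude_km):
    """Lower bound on RTT through one satellite directly overhead.

    A round trip crosses the ground-to-satellite gap four times:
    up and down for the request, then up and down for the reply.
    """
    return 4 * altitude_km / C_KM_PER_S * 1000


print(round(min_satellite_rtt_ms(35_786)))   # GEO: 477
print(round(min_satellite_rtt_ms(550), 1))   # LEO at 550 km: 7.3
```

The gap between these floors and observed RTTs is the slant path, ground-station processing, and the terrestrial network beyond the ground station.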

Latency Is Not Bandwidth

Bandwidth is the capacity of the pipe — how many bits per second can flow through it. Latency is the time it takes for a bit to traverse the pipe. These are independent dimensions. A 10 Gbps satellite link with 600 ms RTT can transfer large files quickly in throughput terms, but every interactive operation (a DNS lookup, a TLS handshake, a database query) takes at least 600 ms just for the round trip, regardless of the link's bandwidth.
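A rough model makes the independence concrete. The sketch below assumes that an interactive fetch costs a fixed number of round trips plus the raw transfer time; the function name, the 50 KB response size, and the count of three round trips are hypothetical values chosen for illustration, and the model ignores congestion-window ramp-up.

```python
def fetch_time_s(size_bytes, bandwidth_bps, rtt_s, round_trips):
    """Rough completion time: protocol round trips plus raw transfer time."""
    return round_trips * rtt_s + size_bytes * 8 / bandwidth_bps


small_response = 50_000  # a 50 KB response (hypothetical)

# 10 Gbps satellite link with 600 ms RTT, three round trips assumed.
print(round(fetch_time_s(small_response, 10e9, 0.600, 3), 3))   # 1.8
# 100 Mbps terrestrial link with 20 ms RTT, same three round trips.
print(round(fetch_time_s(small_response, 100e6, 0.020, 3), 3))  # 0.064
```

The link with one hundred times the bandwidth finishes far later, because the transfer term is tiny and the round-trip term dominates.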

The relationship between latency and TCP throughput is captured by the bandwidth-delay product (BDP): the amount of data "in flight" (unacknowledged, in the network) at any moment. For a TCP connection to fill a pipe, it must have enough data in flight to keep the pipe busy during one RTT:

BDP = Bandwidth × RTT
Example: 100 Mbps link, 100 ms RTT
BDP = 100 × 10⁶ bits/s × 0.1 s = 10 × 10⁶ bits = 10 Mb = 1.25 MB

TCP's congestion window must grow to 1.25 MB before this link is fully utilized. TCP slow start begins at roughly 10–14 KB (an initial congestion window of 10 segments, IW10) and doubles the window about once per RTT, so reaching 1.25 MB takes about seven round trips, or roughly 0.7 s on this path, even with no loss. On a high-BDP path, short transfers can finish before the window ever reaches the BDP, and any loss during the ramp slows it further. This is why TCP congestion control algorithms like CUBIC and BBR are specifically designed to handle high-BDP paths, and why HTTP/3 and QUIC reduce protocol handshake round trips aggressively.
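The BDP arithmetic and the slow-start ramp can be checked with a short script. This is a minimal sketch, assuming an idealized slow start that doubles the window every round trip with no loss and an initial window of 10 segments of 1,460 bytes; the function names are illustrative.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_s / 8


def slow_start_round_trips(target_bytes, initial_window_bytes=10 * 1460):
    """Round trips for an idealized, loss-free slow start to reach target_bytes."""
    cwnd = initial_window_bytes
    rounds = 0
    while cwnd < target_bytes:
        cwnd *= 2  # window doubles once per RTT during slow start
        rounds += 1
    return rounds


bdp = bdp_bytes(100e6, 0.100)       # 100 Mbps link, 100 ms RTT
print(bdp)                          # 1250000.0
print(slow_start_round_trips(bdp))  # 7
```

Seven round trips at 100 ms each is about 0.7 s spent below full rate, which matters most for transfers small enough to finish during the ramp.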

Bufferbloat and AQM

Bufferbloat is the phenomenon where excessively large network buffers cause high latency under load. It was analyzed and named by Jim Gettys around 2010, though the problem had existed for years. The cause is straightforward: home routers, DSL modems, and access equipment were shipped with buffers large enough to hold several seconds of traffic at line rate. Under sustained load, these buffers fill completely, and packets experience hundreds of milliseconds of queuing delay waiting for earlier packets to drain. A network that measures 5 ms idle RTT can measure 500+ ms RTT while uploading a large file.
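The jump from 5 ms to 500+ ms is just the time needed to drain the data queued ahead of a packet. Here is a minimal sketch of that calculation; the 10 Mbps uplink and 640 KB of queued data are hypothetical values, and the function names are illustrative.

```python
def queue_drain_delay_ms(buffered_bytes, link_bps):
    """Time for the link to transmit everything already queued, in milliseconds."""
    return buffered_bytes * 8 / link_bps * 1000


def loaded_rtt_ms(idle_rtt_ms, buffered_bytes, link_bps):
    """Idle RTT plus the queuing delay added by a standing queue."""
    return idle_rtt_ms + queue_drain_delay_ms(buffered_bytes, link_bps)


# Hypothetical: 5 ms idle RTT, 10 Mbps uplink, 640 KB sitting in the buffer.
print(round(loaded_rtt_ms(5, 640_000, 10e6)))  # 517
```

Halving the link rate or doubling the standing queue doubles the added delay, which is why slow uplinks with large buffers suffer most.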

The traditional solution — increasing buffer sizes — makes throughput stable but makes latency terrible. The correct solution is AQM (Active Queue Management): instead of filling the buffer completely before dropping packets, the router begins proactively dropping or marking packets when the queue starts to grow, signaling TCP senders to slow down before the buffer fills. This keeps buffer occupancy low and latency stable even under load.

CoDel (Controlled Delay, RFC 8289) measures the sojourn time of packets in the queue (how long each packet waits) rather than the queue length, and begins dropping packets when the sojourn time exceeds 5 ms for more than 100 ms. This directly targets the latency objective rather than buffer occupancy as a proxy. FQ-CoDel (Fair Queuing CoDel, RFC 8290) adds a flow-aware fair queuing layer that prevents any single flow from monopolizing the buffer, ensuring that a large bulk download does not add latency to an interactive flow sharing the same link.
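The sojourn-time rule can be sketched in a few lines. This is a simplified illustration of only the "above target for a full interval" check, using the 5 ms and 100 ms values quoted above; it omits the drop-scheduling control law and other details of RFC 8289, and the class and method names are hypothetical.

```python
TARGET_S = 0.005    # acceptable standing queue delay (5 ms)
INTERVAL_S = 0.100  # how long delay must stay above target (100 ms)


class SojournMonitor:
    """Flags packets for dropping once queue delay stays above target."""

    def __init__(self, target_s=TARGET_S, interval_s=INTERVAL_S):
        self.target_s = target_s
        self.interval_s = interval_s
        self.deadline_s = None  # when "above target" becomes persistent

    def should_drop(self, enqueue_time_s, now_s):
        """Call at dequeue time with the packet's enqueue timestamp."""
        sojourn_s = now_s - enqueue_time_s
        if sojourn_s < self.target_s:
            # Queue delay is fine again: forget any earlier excursion.
            self.deadline_s = None
            return False
        if self.deadline_s is None:
            # First packet above target: start the interval clock.
            self.deadline_s = now_s + self.interval_s
            return False
        # Above target on every dequeue since the clock started.
        return now_s >= self.deadline_s
```

Because the decision depends on how long packets waited rather than how many bytes are queued, the same parameters work across very different link speeds.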

FQ-CoDel is the default queue discipline on many Linux distributions (configurable with tc qdisc) and is deployed in OpenWrt on home routers, demonstrating that bufferbloat is solvable with appropriate AQM. The key insight is that you want the queue to stay short enough to drain quickly, rather than the buffer to be large enough to never overflow.

[Figure: Latency Under Load: Bufferbloat vs. FQ-CoDel. RTT (ms, 0–400) plotted against time (0–20 s), with the bulk transfer starting at t = 2 s. Series: Bufferbloat (large buffer) and FQ-CoDel AQM.]

How CDNs and Anycast Cut Latency

The most effective way to reduce propagation delay is to move content physically closer to users. CDNs (Content Delivery Networks) replicate content to edge servers in hundreds of locations worldwide. When a user in Sydney requests a cached image, it is served from a CDN node in Sydney rather than from an origin server in Virginia — the propagation delay drops from ~200 ms to ~5 ms.
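The benefit of moving content closer can be estimated from geography. The sketch below uses the haversine formula to get a great-circle distance and converts it to a minimum fiber RTT at 200,000 km/s; the coordinates are approximate values for Sydney and for Ashburn, Virginia, chosen here as an assumed origin location, and real cable routes are longer than the great circle.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_S = 200_000  # approximate signal speed in fiber


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_fiber_rtt_ms(distance_km):
    """Round-trip propagation time over fiber of the given length."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


# Approximate coordinates (assumptions for illustration).
sydney = (-33.87, 151.21)
ashburn_va = (39.04, -77.49)

distance = great_circle_km(*sydney, *ashburn_va)
print(f"{distance:.0f} km, at least {min_fiber_rtt_ms(distance):.0f} ms RTT")
```

The great-circle floor comes out somewhat below the ~200 ms quoted above, which is expected since submarine and terrestrial routes do not follow the shortest path. An edge node a few hundred kilometers away has a floor of only a few milliseconds.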

Anycast routing takes this further: a single IP address (like 1.1.1.1 or 8.8.8.8) is announced from multiple geographic locations simultaneously via BGP. Routers automatically send traffic to the topologically closest announcement. A user in Tokyo hits a DNS resolver in Tokyo; a user in London hits one in London. The BGP routing system effectively provides geographic load balancing and latency optimization without any client-side configuration. See how CDNs work for deeper coverage.

Hollow-Core Fiber and Financial Extremes

For high-frequency trading (HFT), microseconds matter, literally. The propagation speed of light in conventional fiber glass is about c×0.66. Hollow-core fiber, which guides light through a hollow air or vacuum core rather than solid glass, allows light to travel at approximately c×0.9997, close to the speed of light in a vacuum. Light therefore travels roughly 47% faster than in conventional fiber, shaving about 1.6 microseconds off each kilometer. Deployed on the London–Frankfurt corridor and the Chicago–New York route, hollow-core fiber links give HFT firms a measurable edge.

Before hollow-core fiber, HFT firms used microwave radio links (whose signals travel at nearly c×1.0 through air rather than the slower c×0.66 through glass) to transmit market data. The Chicago–New York microwave path is roughly 1,200 km of relay towers, achieving ~8 ms RTT versus ~13 ms on fiber. The latency advantage was worth tens of millions of dollars in infrastructure investment.
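The three media can be compared side by side over the same distance. This is a minimal sketch using the velocity factors quoted in this section and a 1,200 km path for all three; that equal-length assumption is a simplification, since fiber routes are usually longer than a line of microwave towers.

```python
C_KM_PER_S = 299_792.458  # speed of light in a vacuum

# Velocity factors as quoted in the text (approximate).
MEDIA = {
    "conventional fiber": 0.66,
    "hollow-core fiber": 0.9997,
    "microwave (air)": 1.0,
}


def rtt_ms(distance_km, velocity_factor):
    """Round-trip propagation time for a medium with the given velocity factor."""
    return 2 * distance_km / (velocity_factor * C_KM_PER_S) * 1000


for name, factor in MEDIA.items():
    print(f"{name}: {rtt_ms(1_200, factor):.2f} ms RTT over 1,200 km")
```

Over equal distances, conventional fiber comes out near 12 ms while hollow-core fiber and microwave both land near 8 ms, in line with the figures above.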

Measuring Latency

Several tools provide latency measurement at different layers.
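One transport-layer approach is to time a TCP handshake, which takes roughly one round trip to complete from the client's point of view. The following is a minimal sketch using Python's standard socket module; the function name is hypothetical, and the measurement also includes DNS resolution time when the host is given as a name.

```python
import socket
import time


def tcp_connect_time_ms(host, port, timeout_s=3.0):
    """Time to open a TCP connection to host:port, in milliseconds.

    Connecting requires a SYN and a SYN-ACK, so the result approximates
    one RTT plus local overhead (and name resolution, if any).
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout_s):
        elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0


# Example usage (requires network access; the host is a placeholder):
# print(f"{tcp_connect_time_ms('example.com', 443):.1f} ms")
```

Taking the minimum of several samples gives a better estimate of the path's base RTT, since queuing delay only ever adds time.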
