How DNS Load Balancing Works: From Round-Robin to Global Server Load Balancing

DNS load balancing is the practice of distributing client traffic across multiple servers or data centers by returning different IP addresses in DNS responses. Unlike traditional load balancers that sit in the data path and forward every packet, DNS-based load balancing operates at the name resolution layer -- it steers clients before they ever open a TCP connection. At its simplest, DNS load balancing is a round-robin rotation of A/AAAA records. At its most sophisticated, it becomes Global Server Load Balancing (GSLB): a system that considers client geography, server health, real-time load, and network conditions to direct each client to the optimal endpoint. Every major internet service -- from Cloudflare (AS13335) to Google (AS15169) to AWS (AS16509) -- uses DNS-based traffic steering as a critical component of its global infrastructure.

DNS Round-Robin: The Simplest Form

The most basic DNS load balancing technique is round-robin: configure multiple A or AAAA records for the same domain name, and the authoritative DNS server rotates the order of records in each response. When a client receives multiple addresses, most DNS resolvers and operating systems use the first address in the list, so rotating the order distributes connections across servers.

; Simple DNS round-robin
example.com.  300  IN  A  198.51.100.10
example.com.  300  IN  A  198.51.100.11
example.com.  300  IN  A  198.51.100.12
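The rotation described above can be sketched in a few lines of Python. This is a simplified model, not any real DNS server's implementation: the record set is rotated after every query, and each simulated client connects to the first address it receives.

```python
from collections import Counter, deque

# Hypothetical record set matching the zone snippet above
records = deque(["198.51.100.10", "198.51.100.11", "198.51.100.12"])

def answer():
    """Return the record set, then rotate it for the next query
    (a simplified model of authoritative round-robin)."""
    response = list(records)
    records.rotate(-1)  # the next query sees a different first record
    return response

# Most clients connect to the first address in the response,
# so rotating the order spreads connections across the three servers.
picks = Counter(answer()[0] for _ in range(9))
print(picks)  # each address is picked 3 times out of 9 queries
```

Note that nothing in this loop knows whether a server is up or how loaded it is -- which is exactly the weakness discussed next.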

DNS round-robin has severe limitations that make it unsuitable as a primary load balancing mechanism for production services:

  - No health checking: the authoritative server keeps returning the address of a failed server, and clients that receive it see connection errors until the record is removed and caches expire.
  - No load awareness: records are rotated blindly, regardless of how busy each server actually is.
  - Caching skews distribution: a large recursive resolver caches one ordering and hands it to thousands of clients for the TTL duration, so real traffic can be far from evenly spread.
  - No weighting: every record is returned equally often, so servers with different capacities receive the same share of traffic.

Despite these limitations, DNS round-robin is still used as a coarse distribution mechanism, often in combination with other techniques. Many CDNs and cloud providers use DNS round-robin to distribute traffic across multiple load balancer VIPs, each of which then performs proper health-checked load balancing.

TTLs and the DNS Caching Problem

Every DNS record has a Time-to-Live (TTL) value that tells recursive resolvers how long to cache the response. TTL is the fundamental constraint on DNS-based load balancing and failover speed. The engineering tradeoff is straightforward: a short TTL (e.g., 30-60 seconds) lets you shift traffic and fail over quickly, but increases query load on the authoritative servers and adds resolution latency for clients whose caches expire frequently; a long TTL (e.g., hours) reduces query volume and latency, but means stale records -- including records pointing at failed servers -- persist in caches long after you change them.

In practice, many resolvers and client libraries do not strictly honor TTLs. Some resolvers impose minimum TTLs (e.g., 30 seconds) regardless of what the authoritative server specifies. Some clients (particularly Java's InetAddress cache) cache DNS responses indefinitely unless explicitly configured otherwise. Browser DNS caches, operating system caches, and corporate DNS appliances add additional caching layers, each with their own TTL enforcement quirks. This means that even a 30-second TTL does not guarantee failover within 30 seconds -- the actual time depends on the caching behavior of every layer between the client and the authoritative server.

RFC 8767 (Serving Stale Data to Improve DNS Resiliency) explicitly encourages resolvers to serve expired cached records when the authoritative server is unreachable. This is good for resilience but means DNS changes may propagate even more slowly than the TTL suggests.
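A simplified worst-case model of this layered caching can be written down directly. The assumption here (hypothetical, for illustration) is that each caching layer clamps the authoritative TTL to its own minimum, and that in the worst case each downstream cache refreshes just before the change, so the delays add up.

```python
# Simplified worst-case model of DNS change propagation through
# stacked caches. Each layer may enforce its own minimum TTL
# (the floor values below are hypothetical), and in the worst case
# a downstream cache refreshes just before the change, so the
# per-layer delays accumulate.

def worst_case_propagation(auth_ttl, layer_min_ttls):
    """Upper bound (seconds) before every client sees a DNS change."""
    return sum(max(auth_ttl, floor) for floor in layer_min_ttls)

# A 30-second authoritative TTL, passing through a corporate resolver
# that enforces a 300s minimum, plus an OS cache and a browser cache
# that honor the TTL:
print(worst_case_propagation(30, [300, 30, 30]))  # 360 seconds, not 30
```

The point of the sketch is the asymmetry: the authoritative TTL sets a floor on propagation time, but intermediate layers can only make it worse, never better.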

GeoDNS: Location-Aware Resolution

GeoDNS extends DNS load balancing with geographic intelligence. The authoritative DNS server determines the client's approximate location and returns the IP address of the nearest (or otherwise optimal) server or data center. This is the foundation of GSLB for most internet services.

How GeoDNS Determines Location

The DNS protocol itself does not carry the client's IP address -- the authoritative server only sees the IP address of the recursive resolver that forwarded the query. For large public resolvers like Google Public DNS (8.8.8.8) or Cloudflare DNS (1.1.1.1), the resolver's IP may be thousands of miles from the actual client. This is the fundamental problem that EDNS Client Subnet (ECS) solves.

EDNS Client Subnet (RFC 7871)

ECS is an extension to the DNS protocol that allows recursive resolvers to forward a truncated version of the client's IP address (typically /24 for IPv4, /56 for IPv6) to the authoritative server. The authoritative server can then use this subnet to determine the client's geographic location and return a geographically appropriate response.
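The truncation step can be sketched with Python's standard `ipaddress` module. This shows only the subnet computation a resolver performs before attaching the ECS option, not the wire-format encoding:

```python
import ipaddress

def ecs_subnet(client_ip, v4_prefix=24, v6_prefix=56):
    """Truncate a client address to the prefix a resolver would
    forward in the ECS option (typically /24 for IPv4, /56 for IPv6)."""
    addr = ipaddress.ip_address(client_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False zeroes the host bits, yielding the subnet
    return ipaddress.ip_network(f"{client_ip}/{prefix}", strict=False)

print(ecs_subnet("203.0.113.42"))           # 203.0.113.0/24
print(ecs_subnet("2001:db8:1234:5678::1"))  # 2001:db8:1234:5600::/56
```

The authoritative server never learns the full client address -- only this truncated subnet, which is the privacy compromise discussed below.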

[Diagram: EDNS Client Subnet (ECS) for GeoDNS. A client in Tokyo (203.0.113.42) queries "A? cdn.example.com" through a recursive resolver at 8.8.8.8 in California, which forwards the query to the GeoDNS-enabled authoritative server ns1.example.com. Without ECS, the authoritative server sees only the resolver's IP (8.8.8.8, California) and returns a US West server IP -- wrong for the Tokyo client. With ECS, it sees the client subnet 203.0.113.0/24 (Tokyo) and returns the Tokyo data center IP -- correct for the client.]

An example GeoDNS response-selection table:

  203.0.113.0/24    Tokyo, JP     ->  192.0.2.10 (NRT)
  198.51.100.0/24   London, UK    ->  192.0.2.20 (LHR)
  172.16.0.0/16     New York, US  ->  192.0.2.30 (EWR)
  default (fallback)              ->  192.0.2.40 (IAD)
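A hypothetical longest-prefix-match selection over that kind of mapping can be sketched as follows. The table entries and IPs mirror the example above; real GeoDNS implementations use far larger GeoIP databases and radix trees rather than a linear scan.

```python
import ipaddress

# Hypothetical GeoDNS response-selection table (subnet -> server IP)
GEO_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "192.0.2.10"),   # Tokyo (NRT)
    (ipaddress.ip_network("198.51.100.0/24"), "192.0.2.20"),  # London (LHR)
    (ipaddress.ip_network("172.16.0.0/16"), "192.0.2.30"),    # New York (EWR)
]
DEFAULT = "192.0.2.40"  # fallback (IAD)

def select_answer(ecs_subnet):
    """Longest-prefix match of the ECS subnet against the geo table."""
    client = ipaddress.ip_network(ecs_subnet)
    matches = [(net, ip) for net, ip in GEO_TABLE
               if client.subnet_of(net) or net.subnet_of(client)]
    if not matches:
        return DEFAULT
    # Prefer the most specific (longest) matching prefix
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(select_answer("203.0.113.0/24"))  # 192.0.2.10 (Tokyo client)
print(select_answer("192.0.2.0/24"))    # 192.0.2.40 (no match -> fallback)
```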

ECS introduces a privacy tradeoff: the client's subnet is revealed to the authoritative server and any intermediate resolvers. This has led to debate in the DNS community. Some privacy-focused resolvers (notably Quad9 at 9.9.9.9) do not send ECS by default. The scope prefix length (how much of the client's address is revealed) is negotiable -- sending /24 reveals the client's /24 subnet (256 addresses), while /20 reveals less precision. The authoritative server responds with a "scope" indicating how location-specific its answer is, which determines how the resolver caches the response.

ECS also complicates resolver caching. Without ECS, a resolver caches one answer per domain name. With ECS, the resolver must cache different answers for different client subnets -- a single popular domain can generate thousands of cache entries, one per unique client subnet prefix. This dramatically increases resolver memory usage and is one reason not all resolvers support ECS.
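The cache fragmentation is easy to see in a toy model. Here the cache key grows from the query name alone to a (name, client-subnet) pair; the subnets are hypothetical:

```python
# Sketch of why ECS fragments a resolver cache: the cache key grows
# from the query name alone to (name, client-subnet scope).

cache = {}

def cache_key(qname, ecs_scope=None):
    return (qname, ecs_scope)

# Without ECS: one entry per name, however many clients ask.
cache[cache_key("cdn.example.com")] = "192.0.2.40"

# With ECS: one entry per unique client subnet the answer is scoped to.
for i in range(256):
    cache[cache_key("cdn.example.com", f"203.0.{i}.0/24")] = "192.0.2.10"

print(len(cache))  # 257 entries for a single domain name
```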

Weighted DNS Records

Beyond simple round-robin, many DNS providers support weighted records that control the proportion of traffic each endpoint receives. AWS Route 53, Google Cloud DNS, and Cloudflare all support this natively:

; Weighted DNS records (illustrative -- providers like Route 53
; configure weights via their record-set APIs, not zone-file comments)
; 70% of resolutions return the primary
example.com.  60  IN  A  198.51.100.10  ; weight: 70
; 30% of resolutions return the secondary
example.com.  60  IN  A  198.51.100.20  ; weight: 30
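Conceptually, each resolution is a weighted random draw. A minimal sketch (the endpoints and weights match the snippet above; the seed is fixed only to make the example reproducible):

```python
import random
from collections import Counter

# Hypothetical weighted record set matching the snippet above
ENDPOINTS = ["198.51.100.10", "198.51.100.20"]
WEIGHTS = [70, 30]

rng = random.Random(42)  # seeded so the sketch is reproducible

def resolve():
    """Pick one record per resolution, proportionally to its weight."""
    return rng.choices(ENDPOINTS, weights=WEIGHTS, k=1)[0]

counts = Counter(resolve() for _ in range(10_000))
share = counts["198.51.100.10"] / 10_000
print(f"primary share: {share:.1%}")  # close to 70% over many resolutions
```

As the next paragraph explains, the ratio only emerges statistically over many cache misses -- each individual resolver holds one answer for the full TTL.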

Weighted records are useful for gradual traffic migration (shift 10% of traffic to a new data center, monitor, increase), A/B testing at the DNS level, and capacity-proportional distribution across data centers with different sizes. Combined with health checks, weights can be dynamically adjusted: when a data center becomes unhealthy, its weight drops to zero and all traffic is steered to remaining healthy endpoints.

The granularity of weighted DNS is limited by caching. With a 60-second TTL and weighted records, the weight ratio is approximately achieved over many resolver cache misses, but any individual resolver caches a single answer for the TTL duration. This means weighted DNS provides statistical distribution over time, not per-request precision.

Health-Check-Driven DNS Failover

The critical improvement of GSLB over basic DNS round-robin is active health checking. A GSLB system continuously monitors the health of each endpoint and removes unhealthy endpoints from DNS responses.

Health check architectures vary by provider. Common approaches include HTTP(S) checks from a fleet of globally distributed probes, where an endpoint is marked unhealthy only when some quorum of probes fails (to avoid flapping on a single probe's network problems); simple TCP connect checks for non-HTTP services; and agent- or API-reported health, where the application itself pushes status and load metrics to the GSLB control plane.

The fundamental limitation of DNS-based failover remains TTL caching. When the authoritative server removes an unhealthy IP from responses, clients that have already cached the unhealthy IP continue using it until the TTL expires. This is why DNS-based failover is typically combined with other mechanisms -- anycast withdrawal via BGP for L3 failover, or application-layer retries that try alternative addresses from the DNS response.
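The core control loop is simple to sketch. One common policy worth noting (assumed here, not universal) is "fail open": if every endpoint fails its checks, return the full set rather than an empty answer, since an empty answer guarantees an outage while a stale one might still work.

```python
# Sketch of health-check-driven DNS answers: only endpoints passing
# their most recent checks are returned; if all fail, serve the full
# set rather than an empty answer (a common "fail open" policy).

ENDPOINTS = {"198.51.100.10": True, "198.51.100.11": True, "198.51.100.12": True}

def record_check(ip, healthy):
    """Record the latest health-check result for an endpoint."""
    ENDPOINTS[ip] = healthy

def dns_answer():
    healthy = [ip for ip, ok in ENDPOINTS.items() if ok]
    return healthy or list(ENDPOINTS)  # fail open if nothing is healthy

print(dns_answer())                 # all three endpoints
record_check("198.51.100.11", False)
print(dns_answer())                 # .11 removed from responses
```

Even after `.11` disappears from new responses, clients holding it in cache keep connecting to it until their TTL expires -- the limitation described above.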

GSLB Architecture Patterns

Global Server Load Balancing combines GeoDNS, health checking, and traffic policies into a system that optimizes global traffic distribution. Several architectural patterns are common:

Active-Active with Geographic Steering

Multiple data centers actively serve traffic, with GeoDNS directing clients to the nearest one. Health checks continuously monitor each site, and failing sites are removed from DNS. This is the most common GSLB pattern and provides both performance optimization (low latency via geographic proximity) and high availability (automatic failover to remaining sites).

The challenge is data consistency: if users in Tokyo are served by the Tokyo data center and users in London by the London data center, application state must be replicated or partitioned across sites. Databases, session stores, and caches all need multi-region strategies.

Active-Passive with DNS Failover

One data center is designated as primary and serves all traffic under normal conditions. A secondary site is kept in standby. Health checks monitor the primary, and if it fails, DNS records are updated to point to the secondary. This pattern is simpler operationally (no multi-region data replication needed during normal operation) but wastes the secondary site's capacity during normal conditions and has slower failover due to DNS TTL propagation.

Latency-Based Routing

Instead of routing based on geographic proximity, latency-based routing measures actual network latency from each DNS resolver location to each data center and returns the lowest-latency endpoint. AWS Route 53's latency-based routing uses this approach, maintaining a global latency database built from periodic measurements.

Latency-based routing can produce counter-intuitive results. A data center that is geographically farther may have lower latency due to better peering, dedicated fiber, or less congested paths. For example, a user in Mumbai might get lower latency to a Singapore data center than to a closer one in Chennai, depending on the BGP routing and submarine cable topology between those points.
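The decision itself is a minimum over a measured latency table. The sketch below uses hypothetical RTT measurements to reproduce the Mumbai example; real systems like Route 53 maintain such tables from continuous measurements rather than static values:

```python
# Sketch of latency-based routing: return the data center with the
# lowest measured latency from the querying resolver's location.
# All measurements below are hypothetical.

LATENCY_MS = {  # (resolver_location, data_center) -> measured RTT
    ("mumbai", "singapore"): 62,
    ("mumbai", "chennai"): 78,   # geographically closer, but a slower path
    ("london", "singapore"): 160,
    ("london", "chennai"): 130,
}
DC_IP = {"singapore": "192.0.2.50", "chennai": "192.0.2.60"}

def route(resolver_location):
    best = min(DC_IP, key=lambda dc: LATENCY_MS[(resolver_location, dc)])
    return DC_IP[best]

print(route("mumbai"))  # 192.0.2.50 -- Singapore wins despite the distance
```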

Anycast DNS: Eliminating the TTL Problem

Anycast DNS takes a fundamentally different approach to DNS-based traffic distribution. Instead of returning different IP addresses to steer clients, anycast announces the same IP address from multiple locations via BGP. Internet routing automatically directs each client's DNS query to the topologically nearest anycast instance.

[Diagram: Anycast DNS vs GeoDNS. With GeoDNS (different IPs per location), the US client gets 198.51.100.10 (US data center) and the EU client gets 198.51.100.20 (EU data center); failover depends on DNS TTL expiration, cached records may point to dead servers, and failover takes TTL plus propagation (30s - 5min). With anycast (same IP, BGP routing), both clients get 198.51.100.1 and BGP routes each to the nearest PoP; failover happens via BGP route withdrawal -- the IP stays the same and routing changes -- within BGP convergence time (1-30s).]

  Property                 GeoDNS                          Anycast
  Failover speed           TTL-bounded (30s-5min)          BGP convergence (1-30s)
  Client caching impact    Stale IPs cause failures        Same IP, routing adapts
  Routing precision        Fine (per subnet with ECS)      Coarse (BGP topology)
  TCP session persistence  Stable (IP does not change)     Can break during reconvergence
  Protocol support         Any (DNS is protocol-agnostic)  Any (routing is protocol-agnostic)

Anycast DNS is the dominant architecture for authoritative DNS services. Cloudflare (AS13335) operates its 1.1.1.1 resolver and all authoritative DNS from anycast nodes in 300+ cities. The root DNS servers (a.root-servers.net through m.root-servers.net) are predominantly anycast -- the "J-root" operated by Verisign has over 200 anycast instances globally. When one instance fails, BGP withdraws its route and traffic seamlessly shifts to the next-nearest instance without any DNS record changes or TTL dependencies.

Anycast for DNS works particularly well because DNS queries are typically single UDP packets -- there is no persistent connection state to break during re-routing. For TCP-based services, anycast failover is more complex because existing TCP connections may be disrupted when routing changes shift traffic to a different instance. This is why anycast is universal for DNS but less common for stateful services that require long-lived connections.

Combining DNS Load Balancing with Other Techniques

In practice, DNS load balancing is rarely used in isolation. It is one layer in a multi-tier traffic management architecture:

  1. DNS GSLB -- GeoDNS or anycast directs clients to the nearest data center or edge location. This is the coarsest level of traffic steering, operating at the data center or region granularity.
  2. Anycast + ECMP -- Within a data center, the service IP is announced via BGP from multiple load balancer instances. ECMP at the top-of-rack switch distributes flows across load balancer instances.
  3. L4/L7 load balancing -- Each load balancer instance (HAProxy, NGINX, Envoy) distributes requests across application server pools with health checking, session persistence, and content-based routing.

This layered approach provides defense in depth: DNS GSLB handles site-level failures, BGP/ECMP handles load balancer failures, and L4/L7 load balancing handles individual server failures. Each layer has its own health checking and failover mechanism, and the combination provides end-to-end resilience.

DNS Load Balancing for Multi-CDN

Large content publishers often use multiple CDN providers simultaneously (multi-CDN) and steer traffic between them using DNS. A DNS-based traffic manager like Citrix Intelligent Traffic Management, NS1, or Cloudflare Load Balancing sits as the authoritative DNS for the content domain and directs each client to the CDN that offers the best performance for that client's location.

Multi-CDN DNS steering considers: real-user performance measurements (RUM beacons from JavaScript embedded in pages), synthetic monitoring from global probes, CDN-reported availability and capacity, cost (steering traffic to the cheapest CDN that meets performance requirements), and contractual commit levels (ensuring minimum traffic volumes to each CDN to meet contract terms).

This is one area where DNS-based load balancing offers capabilities that are difficult to replicate at other layers. Since the DNS decision happens before the client connects, you can steer an entire session to a specific CDN -- something a packet-level load balancer cannot do, because by the time it sees any traffic the client has already connected to a particular CDN's address.
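A multi-CDN steering decision can be sketched as a constrained optimization: among CDNs that are available and meet a latency target, pick the cheapest. All CDN names, prices, and measurements below are hypothetical, and real traffic managers also factor in RUM data, capacity, and contractual commit levels as described above:

```python
# Sketch of a multi-CDN steering decision: among available CDNs that
# meet the latency budget, pick the cheapest. All values hypothetical.

CDNS = {
    "cdn_a": {"p95_ms": 80, "available": True, "cost_per_gb": 0.04},
    "cdn_b": {"p95_ms": 55, "available": True, "cost_per_gb": 0.07},
    "cdn_c": {"p95_ms": 40, "available": False, "cost_per_gb": 0.03},
}

def steer(latency_budget_ms):
    candidates = {name: c for name, c in CDNS.items()
                  if c["available"] and c["p95_ms"] <= latency_budget_ms}
    if not candidates:  # nothing meets the budget: fall back to any available CDN
        candidates = {n: c for n, c in CDNS.items() if c["available"]}
    return min(candidates, key=lambda n: candidates[n]["cost_per_gb"])

print(steer(100))  # cdn_a: meets the budget and is cheaper than cdn_b
print(steer(60))   # cdn_b: the only available CDN under 60 ms
```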

Challenges and Failure Modes

DNS load balancing has several well-known failure modes that network engineers must account for:

  - Stale caches: resolvers and clients that ignore or extend TTLs keep sending traffic to removed or failed endpoints.
  - GeoIP inaccuracy: geolocation databases misplace some client subnets, steering those clients to distant data centers.
  - Resolver mismatch: without ECS, clients of large public resolvers are located by the resolver's IP rather than their own, defeating geographic steering.
  - Thundering herd on failover: when a site is removed from DNS, its entire load shifts to the remaining sites as caches expire, which can overload them if capacity headroom is insufficient.
  - Health check blind spots: probes test reachability from the checker's vantage points, which may not reflect what real clients experience.

Real-World DNS Load Balancing Implementations

Major services use DNS load balancing in distinct ways. Cloudflare relies primarily on anycast, announcing the same IPs from hundreds of cities, so DNS itself plays little steering role. AWS Route 53 exposes DNS-level policies directly -- weighted, latency-based, geolocation, and failover routing -- that customers compose into GSLB configurations. Large content publishers layer multi-CDN DNS steering on top, choosing among CDN providers per client based on real-user measurements.

DNS Load Balancing and BGP

BGP and DNS-based load balancing are the two primary mechanisms for global traffic distribution, and they serve complementary roles. BGP operates at the network layer, directing IP packets to the topologically closest announcement point. DNS operates at the application layer, directing clients to the IP address that the GSLB system deems optimal.

In sophisticated deployments, the two systems are tightly integrated. A GSLB controller monitors BGP routing tables to understand network topology and uses this information to make better DNS steering decisions. Conversely, when a BGP route is withdrawn (due to a site failure or maintenance), the GSLB system detects the loss of reachability and stops returning that site's IPs in DNS responses. This bidirectional integration ensures that DNS steering decisions are consistent with actual network reachability.

You can observe the interaction between DNS and BGP by looking up any major service on the god.ad BGP Looking Glass. The BGP routes show you which networks announce each service's IP prefixes, and the AS paths reveal the routing topology that determines where DNS-steered traffic actually flows.
