How HTTP Caching Works: Cache-Control, ETags, and CDN Edge Caching

HTTP caching is the mechanism by which browsers, proxies, and CDN edge servers store copies of HTTP responses and serve them to subsequent requests without contacting the origin server. Caching is what makes the web fast. Without it, every page load, every image, and every API response would require a full round trip to the origin server, adding latency, consuming bandwidth, and increasing server load. The HTTP caching model, defined in RFC 9111 (which obsoletes RFC 7234, itself the replacement for the caching sections of RFC 2616), provides a rich set of directives that give origin servers fine-grained control over what gets cached, for how long, by whom, and under what conditions responses can be served stale.

The Cache-Control Header

Cache-Control is the primary mechanism for controlling HTTP caching behavior. It appears in both responses (from the server) and requests (from the client), though response directives are far more commonly used. Each directive controls a specific aspect of caching behavior:

Response Directives

# Cache for 1 hour in all caches (browsers, CDNs, proxies)
Cache-Control: public, max-age=3600

# Cache only in the browser, not in shared caches (CDNs/proxies)
Cache-Control: private, max-age=3600

# CDN caches for 1 day, browser caches for 5 minutes
Cache-Control: public, max-age=300, s-maxage=86400

# Do not cache at all
Cache-Control: no-store

# Cache but always revalidate before using
Cache-Control: no-cache

# Cache, but serve stale while revalidating in background
Cache-Control: public, max-age=600, stale-while-revalidate=300

# Immutable content (never revalidate, even on reload)
Cache-Control: public, max-age=31536000, immutable

The directives break down as follows:

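max-age=N — the response is fresh for N seconds from the time the origin generated it (time already spent in upstream caches, reported in the Age header, counts against N). After this period the cached response is stale and should be revalidated or refetched before reuse, unless a directive such as stale-while-revalidate or stale-if-error explicitly allows serving it stale.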
s-maxage=N — overrides max-age for shared caches (CDNs, reverse proxies) only. This lets you set a short browser cache (users see updates quickly) with a long CDN cache (reducing origin load). This is the most important directive for CDN-based architectures.
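public — the response can be stored by any cache, including shared caches. Shared caches already store most cacheable responses without it; the directive mainly matters for responses to requests that carry an Authorization header, which shared caches otherwise must not store.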
private — the response is intended for a single user and must not be stored by shared caches. The browser can still cache it. Use this for personalized content (user profiles, account pages, email content).
no-cache — the response can be stored, but must be revalidated with the origin server before every use. Despite the name, this does not prevent caching — it prevents serving cached content without checking freshness first.
no-store — the response must not be stored in any cache. Period. Use this for sensitive data (banking, medical, authentication tokens). Note that this is an instruction, not a guarantee — a compromised or malicious intermediary can ignore it.
must-revalidate — once a cached response becomes stale, the cache must not use it without revalidation. Without this, caches may serve stale content when the origin is unreachable (for example, during an outage).
stale-while-revalidate=N — the cache may serve a stale response for up to N seconds while it revalidates in the background. This provides instant responses to users while ensuring the cache is eventually updated. Introduced in RFC 5861, now widely supported by CDNs and browsers.
stale-if-error=N — the cache may serve a stale response for up to N seconds if the origin returns a 5xx error or is unreachable. This is a resilience mechanism — users get stale but functional content instead of error pages.
immutable — tells the browser that the response body will never change for this URL (defined in RFC 8246). Without immutable, a browser may revalidate cached resources on page reload even if max-age has not expired; the exact reload behavior differs between browsers. With immutable, a supporting browser skips that revalidation, eliminating conditional requests on reload. This is ideal for fingerprinted assets (app.a1b2c3d4.js) that have unique URLs per version.
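
The way these directives combine can be sketched in a few lines of Python. This is a minimal illustration, not a complete RFC 9111 implementation: the function names parse_cache_control and freshness_lifetime are hypothetical, the parser ignores quoted values that contain commas, and a missing lifetime is simply treated as "immediately stale" instead of falling back to Expires or heuristics.

```python
from typing import Optional


def parse_cache_control(header: str) -> dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: value-or-None}.

    Simplification: splits on commas, so quoted values containing
    commas are not handled.
    """
    directives: dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared_cache: bool) -> Optional[int]:
    """Return the freshness lifetime in seconds for this kind of cache.

    None means the response must not be stored by this cache at all.
    0 means it may be stored but must be revalidated before each use.
    """
    d = parse_cache_control(header)
    if "no-store" in d:
        return None
    if shared_cache and "private" in d:
        return None
    if "no-cache" in d:
        return 0
    # s-maxage overrides max-age, but only for shared caches.
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0  # assumption of this sketch: no explicit lifetime, treat as stale
```

For the header "public, max-age=300, s-maxage=86400", calling freshness_lifetime with shared_cache=True yields the CDN lifetime and shared_cache=False yields the browser lifetime, which is the split described for s-maxage above.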

ETag and Conditional Requests

When a cached response becomes stale, the cache does not necessarily need to download the full response again. Conditional requests allow the cache to ask the origin server: "Has this resource changed since I last fetched it?" If not, the server responds with 304 Not Modified (no body), and the cache updates the freshness metadata without transferring the response body.

Two validation mechanisms exist:

ETag / If-None-Match (Strong Validation)

An ETag (entity tag) is an opaque string that uniquely identifies a specific version of a resource. It is typically a content hash or a version identifier. The server sends it in the response:

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: public, max-age=60
ETag: "a1b2c3d4e5f6"

{"asn":13335,"name":"CLOUDFLARENET","prefixes":1842}

When the cached response becomes stale, the client sends a conditional request with If-None-Match:

GET /api/as/13335 HTTP/1.1
If-None-Match: "a1b2c3d4e5f6"

If the resource has not changed (the current ETag matches), the server responds with:

HTTP/1.1 304 Not Modified
ETag: "a1b2c3d4e5f6"
Cache-Control: public, max-age=60

No body is transmitted. The cache marks its stored response as fresh for another 60 seconds. If the resource has changed, the server sends a full 200 OK response with the new content and a new ETag.

ETags can be strong (byte-for-byte identical: "a1b2c3d4") or weak (semantically equivalent: W/"a1b2c3d4"). Weak ETags indicate that the content is equivalent but not necessarily identical — useful when insignificant variations (whitespace, comment timestamps) should not invalidate the cache.
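
On the server side, the ETag exchange above amounts to comparing the tag the client sent with the tag of the current representation. The following Python sketch shows one way to do that. The helper names (make_etag, respond) and the choice of a truncated SHA-256 content hash are assumptions made for illustration; the comparison strips the W/ prefix because If-None-Match uses weak comparison.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body (a content hash is one common choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Drop the weak prefix so two tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, dict, bytes]:
    """Return (status, headers, payload) for a GET, honoring If-None-Match.

    Simplification: the header is split on commas, which is fine for
    hash-style tags but not for tags that themselves contain commas.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    if if_none_match is not None:
        candidates = if_none_match.split(",")
        if if_none_match.strip() == "*" or any(
            _opaque(t) == _opaque(etag) for t in candidates
        ):
            return 304, headers, b""  # unchanged: headers only, no body
    return 200, headers, body
```

A first call with if_none_match=None returns 200 with the body and an ETag; repeating the call with that ETag returns 304 and an empty payload, and changing the body produces a different tag and a 200 again.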

Last-Modified / If-Modified-Since (Weak Validation)

The Last-Modified header provides a timestamp-based validation mechanism. It is simpler but less precise than ETags — if a resource changes twice within the same second, the Last-Modified timestamp cannot distinguish between the two versions.

HTTP/1.1 200 OK
Last-Modified: Fri, 24 Apr 2026 12:00:00 GMT
Cache-Control: public, max-age=60

# Revalidation request:
GET /api/as/13335 HTTP/1.1
If-Modified-Since: Fri, 24 Apr 2026 12:00:00 GMT

# If unchanged:
HTTP/1.1 304 Not Modified

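When both ETag and Last-Modified are present, a cache can send both If-None-Match and If-Modified-Since, and the server evaluates If-None-Match, ignoring If-Modified-Since when If-None-Match is present (the ETag is the stronger validator). Most CDNs and browsers follow this behavior.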
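
Timestamp validation can be sketched with the standard library's HTTP-date helpers. The function names http_date and not_modified_since are hypothetical, and the sketch assumes the server tracks the modification time as a timezone-aware datetime. It truncates to whole seconds before comparing, which is exactly where the one-second resolution limit described above comes from.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(dt: datetime) -> str:
    """Format an aware datetime as an HTTP-date for Last-Modified."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def not_modified_since(last_modified: datetime, if_modified_since: str) -> bool:
    """True if the server may answer 304 for this If-Modified-Since value."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable date: ignore the header, send a full 200
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP-dates carry no sub-second precision, so compare whole seconds.
    return last_modified.replace(microsecond=0) <= since
```

Two edits that land within the same second produce the same Last-Modified string, so a client holding the first version would still receive a 304 after the second edit.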

The Vary Header

The Vary header tells caches that the response varies based on specific request headers. Without Vary, a cache stores one response per URL. With Vary, the cache stores separate responses for each unique combination of the specified header values.

# Response varies based on Accept-Encoding (gzip vs brotli vs identity)
Vary: Accept-Encoding

# Response varies based on Accept-Language
Vary: Accept-Language

# Response varies based on both (separate cache entries for each combination)
Vary: Accept-Encoding, Accept-Language

The most common use of Vary is Vary: Accept-Encoding, which tells caches to store separate compressed and uncompressed versions of a response. Without this, a cache might serve a gzip-compressed response to a client that does not support gzip, or vice versa.

Vary: Accept is important for content-negotiated APIs that serve JSON or HTML depending on the Accept header. Vary: Cookie or Vary: Authorization effectively makes the response uncacheable in shared caches, since these headers differ per user.

A common CDN pitfall: Vary: User-Agent creates a separate cache entry for every unique User-Agent string. Since there are thousands of unique User-Agent strings in the wild, this effectively disables caching. Some CDNs (Cloudflare, Fastly) normalize User-Agent into device classes (mobile, desktop, tablet) to make Vary: User-Agent practical.
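
One way to picture what Vary does inside a cache is as an extension of the cache key. The sketch below is illustrative only: the cache_key function and its tuple layout are assumptions, and real caches typically apply header-specific normalization beyond the whitespace collapsing shown here.

```python
from typing import Optional


def cache_key(
    method: str, url: str, request_headers: dict[str, str], vary: str
) -> Optional[tuple]:
    """Build a cache key from the URL plus the request headers named in Vary.

    Returns None for "Vary: *", which can never match a later request.
    """
    if vary.strip() == "*":
        return None
    lowered = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    # Collapse whitespace so trivially different spellings share an entry.
    selected = tuple((n, " ".join(lowered.get(n, "").split())) for n in names)
    return (method.upper(), url, selected)
```

Two requests for the same URL with Accept-Encoding: gzip and Accept-Encoding: br produce different keys under Vary: Accept-Encoding, while headers not listed in Vary (such as User-Agent) do not affect the key at all.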

CDN Edge Caching

Content Delivery Networks place cache servers at hundreds of locations worldwide, serving cached content from the edge server closest to the user. The CDN caching model builds on HTTP caching but adds layers of its own.

Edge caching behavior is controlled primarily by s-maxage and CDN-specific headers. Each CDN also provides proprietary extensions:

Cloudflare — CDN-Cache-Control (takes precedence over Cache-Control for Cloudflare's edge), cf-cache-status response header (HIT, MISS, EXPIRED, REVALIDATED, DYNAMIC)
Fastly/Varnish — Surrogate-Control header for edge-specific directives, Surrogate-Key for tag-based purging
AWS CloudFront — behaviors configured per path pattern, with minimum/maximum/default TTLs that can override Cache-Control
Akamai — Edge-Control header, comprehensive property manager rules

The CDN-Cache-Control header (a targeted cache-control field defined in RFC 9213) allows origin servers to send caching directives specifically for CDNs; browsers ignore it. This solves the problem of wanting different cache durations at the edge vs. the browser:

# Browser: cache for 60 seconds, CDN: cache for 1 day
Cache-Control: max-age=60
CDN-Cache-Control: max-age=86400
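
The precedence between these headers can be sketched as follows. This is a simplified model, assuming a CDN that honors CDN-Cache-Control and falls back to s-maxage and then max-age; the ttls function is a hypothetical name, and the parsing is a plain comma split, not the Structured Fields parsing that RFC 9213 specifies.

```python
from typing import Optional


def _seconds(value: Optional[str], name: str = "max-age") -> Optional[int]:
    """Extract an integer directive such as max-age from a header value."""
    if not value:
        return None
    for part in value.split(","):
        key, sep, num = part.strip().partition("=")
        if sep and key.strip().lower() == name and num.strip().isdigit():
            return int(num)
    return None


def ttls(headers: dict[str, str]) -> tuple[Optional[int], Optional[int]]:
    """Return (edge_ttl, browser_ttl) in seconds for a response."""
    h = {k.lower(): v for k, v in headers.items()}
    cache_control = h.get("cache-control")
    browser = _seconds(cache_control)
    targeted = h.get("cdn-cache-control")
    if targeted is not None:
        # A targeted field, when present, replaces Cache-Control for the CDN.
        edge = _seconds(targeted)
    else:
        s_maxage = _seconds(cache_control, "s-maxage")
        edge = s_maxage if s_maxage is not None else browser
    return edge, browser
```

With the two headers shown above, the sketch gives an edge TTL of 86400 seconds and a browser TTL of 60 seconds.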

Cache Invalidation

Phil Karlton's famous quote, "There are only two hard things in Computer Science: cache invalidation and naming things", remains painfully true. HTTP caching provides no standard mechanism for an origin to proactively invalidate responses already held by downstream caches. The Cache-Control model is fundamentally based on expiration: the origin sets a TTL, and caches keep serving the stored response, even if the origin's copy has since changed, until the TTL expires. Origin-initiated purging is not part of the HTTP caching specification.

In practice, several strategies exist:

URL fingerprinting is the most reliable invalidation strategy. By embedding a content hash or version in the URL (app.a1b2c3d4.js, /api/v2/data), you create a new URL for every new version. The old URL's cache entry becomes irrelevant because no client requests it. This is why modern build tools generate fingerprinted asset filenames. Combined with Cache-Control: public, max-age=31536000, immutable, fingerprinted URLs can be cached forever.

CDN purge APIs provide explicit invalidation at the CDN layer. Every major CDN offers APIs to purge specific URLs, URL patterns, or cache tags:

# Cloudflare: purge specific URLs
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"files":["https://god.ad/api/as/13335"]}'

# Fastly: purge by surrogate key (tag)
curl -X POST "https://api.fastly.com/service/{id}/purge/as-13335" \
  -H "Fastly-Key: {token}"

Tag-based purging (Fastly's Surrogate-Key, Cloudflare's Cache-Tag) is the most powerful CDN invalidation mechanism. The origin tags responses with identifiers representing the data they contain. When data changes, the origin purges all responses tagged with that identifier. For a BGP looking glass, a route change for AS 13335 could purge all cached responses tagged as-13335, regardless of URL.
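
The bookkeeping behind tag-based purging can be sketched with an in-memory cache that keeps a reverse index from tag to URLs. The TaggedCache class is a toy model written for this article, not how any particular CDN implements it; the URLs other than /api/as/13335 are made up for the example.

```python
from collections import defaultdict
from typing import Iterable, Optional


class TaggedCache:
    """Toy cache that supports purging every entry carrying a given tag."""

    def __init__(self) -> None:
        self._entries: dict[str, bytes] = {}  # url -> cached body
        self._by_tag: defaultdict[str, set[str]] = defaultdict(set)  # tag -> urls

    def store(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._entries[url] = body
        for tag in tags:
            self._by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every entry stored under the tag; return how many were removed."""
        urls = self._by_tag.pop(tag, set())
        return sum(1 for url in urls if self._entries.pop(url, None) is not None)
```

Storing /api/as/13335 and a second, differently named URL under the tag as-13335 and then calling purge_tag("as-13335") removes both entries in one step while leaving entries for other AS numbers untouched.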

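Short TTLs with stale-while-revalidate provide a middle ground. With max-age=60, stale-while-revalidate=3600, any request arriving within 3660 seconds of the last fetch gets an instant response from cache, and the first request after the 60-second freshness window triggers a background refresh. On a frequently requested URL the cache therefore picks up a change roughly a minute after it happens; on a rarely requested URL a user can still receive content up to about an hour old. This trades perfect consistency for performance.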
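
The decision a cache makes under stale-while-revalidate can be written as a small function of the stored response's age. The Action enum and decide function below are hypothetical names for this sketch, and the boundary handling (fresh while age is strictly less than max-age) is one reasonable reading of the freshness rules.

```python
from enum import Enum


class Action(Enum):
    SERVE_FRESH = "serve from cache"
    SERVE_STALE_AND_REVALIDATE = "serve from cache, refresh in background"
    FETCH = "wait for the origin"


def decide(age: int, max_age: int, stale_while_revalidate: int) -> Action:
    """Pick the cache's behavior for a stored response of the given age (seconds)."""
    if age < max_age:
        return Action.SERVE_FRESH
    if age < max_age + stale_while_revalidate:
        return Action.SERVE_STALE_AND_REVALIDATE
    return Action.FETCH
```

With max_age=60 and stale_while_revalidate=3600, a response aged 30 seconds is served fresh, one aged 10 minutes is served stale while a refresh runs, and one aged more than 3660 seconds forces the request to wait for the origin.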

Caching API Responses

API responses present unique caching challenges compared to static assets. The same URL may return different content based on authentication, query parameters, request headers, or server-side state that changes unpredictably.

Effective API caching strategies:

GET-only caching — Only GET requests should be cached. POST, PUT, DELETE, and PATCH are not cacheable by default (and should not be, since they have side effects). A successful POST/PUT/DELETE should invalidate related cached GET responses.
Vary-aware caching — Use Vary: Accept for content-negotiated APIs and Vary: Authorization for per-user responses. Be aware that Vary: Authorization effectively disables shared caching.
Conditional requests for dynamic data — For data that changes unpredictably (BGP route data, stock prices), use Cache-Control: no-cache with an ETag. Every request revalidates with the server, but if the data has not changed, only the 304 header is transmitted instead of the full response.
Short TTL for semi-static data — For data that changes infrequently (AS metadata, GeoIP data), use Cache-Control: public, max-age=300 to absorb identical requests within a 5-minute window.

Browser Cache Behavior

Browser caching is more nuanced than the specification suggests. Different types of navigation trigger different caching behavior:

Normal navigation (clicking a link, typing a URL) — the browser uses cached responses that are still fresh according to max-age. No network requests are made for fresh resources.
Reload (F5 / Cmd+R) — the browser revalidates the page, sending conditional requests (If-None-Match / If-Modified-Since) even for responses that are still fresh. Whether subresources are revalidated as well varies by browser; where they are, the immutable directive avoids the resulting 304 round trips for unchanged resources.
Hard reload (Ctrl+Shift+R / Cmd+Shift+R) — the browser bypasses the cache entirely, sending requests without conditional headers. All resources are downloaded from scratch.
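Back/forward navigation — browsers aggressively use the bfcache (back-forward cache), restoring the entire page from memory without any network requests. This is largely independent of freshness directives such as max-age, although browsers have traditionally kept pages served with Cache-Control: no-store out of the bfcache.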

Browsers also implement heuristic caching: when a response has no Cache-Control or Expires header but does have a Last-Modified header, browsers cache the response with an implicit freshness lifetime, typically 10% of the time since the resource was last modified (the fraction RFC 9111 mentions as a typical setting). This means a resource last modified 100 days ago gets a heuristic max-age of 10 days. This can cause surprising staleness for resources where the server forgot to set explicit cache headers.
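
The heuristic is simple arithmetic, sketched here in Python. The function name heuristic_lifetime and the default fraction of 0.1 are assumptions for illustration; actual browsers may use a different fraction or cap the result.

```python
from datetime import datetime, timedelta


def heuristic_lifetime(
    last_modified: datetime, response_date: datetime, fraction: float = 0.1
) -> timedelta:
    """Estimate a freshness lifetime from how long the resource has been unchanged."""
    unchanged_for = response_date - last_modified
    if unchanged_for < timedelta(0):
        return timedelta(0)  # Last-Modified in the future: do not guess
    return unchanged_for * fraction
```

Passing a Last-Modified value 100 days before the response date returns a lifetime of about 10 days, matching the example above.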

Cache Busting and Versioning Strategies

For static assets (JavaScript, CSS, images, fonts), the standard approach is a two-tier strategy:

HTML pages: Cache-Control: no-cache (always revalidate) or very short max-age. HTML is the entry point that references other assets; it must always point to current asset versions.
Fingerprinted assets: Cache-Control: public, max-age=31536000, immutable. Since the filename changes when the content changes, caching forever is safe. Build tools (webpack, Vite, esbuild) generate these filenames automatically.

This pattern ensures users always get the latest HTML (with fresh asset references) while benefiting from permanent caching of unchanged assets. The initial HTML load revalidates with a 304 most of the time, while all referenced assets are served from cache with zero network overhead.
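
A build step that produces fingerprinted filenames can be approximated in a few lines. This is a sketch of the idea, not what webpack, Vite, or esbuild actually do internally: the fingerprint and cache_control_for functions, the 8-character SHA-256 prefix, and the static/app.js path are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprint(path: str, content: bytes, length: int = 8) -> str:
    """Insert a content hash before the file extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_control_for(is_fingerprinted: bool) -> str:
    """Two-tier policy: cache fingerprinted assets for a year, revalidate the rest."""
    if is_fingerprinted:
        return "public, max-age=31536000, immutable"
    return "no-cache"
```

Because the hash depends only on the content, rebuilding an unchanged file yields the same URL (and keeps existing cache entries useful), while any change to the content yields a new URL that no cache has seen before.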

For single-page applications like the one at god.ad, the HTML shell is served with short cache durations while the embedded JavaScript and CSS (included directly in the HTML in this case) benefit from the HTML's own versioning through deployment-time updates.

Caching and DNS

DNS has its own caching layer (TTLs on DNS records) that interacts with HTTP caching in important ways. When you update DNS to point to a new server (during a migration or failover), clients with cached DNS records continue to connect to the old server. If the old server is down, users experience failures even if the new server is healthy. CDNs mitigate this by providing stable anycast IPs that do not change during infrastructure changes — the CDN handles routing to the correct origin internally.

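For DNS-based load balancing, short DNS TTLs (30-60 seconds) operate independently of HTTP caching: a response served from a fresh HTTP cache involves no DNS lookup at all, while any request that does reach the network (a cache miss or a revalidation) may first need to re-resolve the hostname. DNS TTLs, HTTP cache TTLs, and CDN edge TTLs therefore expire on separate clocks, so when planning a change you need to reason about each layer separately and cannot assume that the shortest TTL determines how quickly users see it.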

Security Implications

Caching introduces several security considerations:

Cache poisoning — If an attacker can inject malicious content into a cache (by exploiting request smuggling, host header injection, or unkeyed request headers), that malicious content is served to all subsequent users. This is particularly dangerous with CDN caches that serve thousands of users.
Sensitive data leakage — Shared caches (CDNs, corporate proxies) can accidentally cache personalized or authenticated content if private or no-store directives are missing. A response with Cache-Control: public containing user-specific data is a data leak waiting to happen.
Web cache deception — An attacker tricks a victim into visiting a URL like /account/settings/style.css. The origin ignores the .css extension and serves the account page (with sensitive data). The CDN, seeing the .css extension, caches the response as a static asset. The attacker then requests the same URL and receives the victim's cached account page.
Cache-timing attacks — Measuring whether a resource is served from cache (fast) or fetched from the network (slow) can reveal whether another user has visited a specific URL. Partitioned caches, which key the browser's HTTP cache by top-level site and are implemented in modern browsers, are the main mitigation.
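
One defense against web cache deception is for the caching layer to refuse to treat a response as a static asset unless its Content-Type agrees with the URL's file extension. The sketch below illustrates that check. The EXPECTED_TYPES table and the function name safe_to_cache_as_static are hypothetical, and a real deployment would need a much fuller mapping.

```python
import posixpath

# Hypothetical, deliberately small mapping from extension to acceptable media types.
EXPECTED_TYPES = {
    ".css": {"text/css"},
    ".js": {"text/javascript", "application/javascript"},
    ".png": {"image/png"},
}


def safe_to_cache_as_static(url_path: str, content_type: str) -> bool:
    """Allow static-asset caching only if the Content-Type matches the extension."""
    extension = posixpath.splitext(url_path)[1].lower()
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in EXPECTED_TYPES.get(extension, set())
```

For the attack URL /account/settings/style.css, the origin's response arrives as text/html, the check fails, and the account page is never stored as if it were a stylesheet.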

Summary

HTTP caching is a multi-layered system that spans browsers, CDN edges, reverse proxies, and origin servers. Cache-Control directives provide fine-grained control over freshness, storage, and revalidation. ETags enable efficient conditional requests that avoid re-transferring unchanged content. The Vary header handles content negotiation by maintaining separate cache entries per header combination. stale-while-revalidate provides the best of both worlds: instant cached responses with background freshness updates.

CDN edge caching amplifies these benefits by placing cached content close to users worldwide, but introduces complexity around cache invalidation, tag-based purging, and CDN-specific control headers. URL fingerprinting remains the most reliable invalidation strategy, and the immutable directive eliminates unnecessary revalidation on page reload.

The god.ad BGP Looking Glass uses Cloudflare CDN caching with appropriate cache headers for its static assets and edge-cached dynamic responses, including generated OG images and API lookups. You can inspect the caching headers on any response using your browser's developer tools or the cf-cache-status header in the response.