How Nginx Works: Architecture and Internals
Nginx is not just another web server. It was built from scratch to solve a problem that Apache's process-per-connection model could not: handling tens of thousands of concurrent connections on a single machine. Igor Sysoev began writing Nginx in 2002 (first public release in 2004) specifically to tackle the C10K problem -- serving 10,000 simultaneous connections -- and the architecture he chose (event-driven, non-blocking, single-threaded worker processes) became the template that most high-performance proxies and web servers have followed since. Today, Nginx serves roughly a third of all websites and is among the most widely deployed reverse proxies, load balancers, and TLS terminators in production infrastructure worldwide.
This article dissects how Nginx actually works: its process model, event loop, request processing phases, reverse proxy mechanics, upstream load balancing, caching layer, and TLS termination. If you operate Nginx in production or are evaluating it against alternatives like HAProxy, Envoy, or Caddy, this is the reference you need.
Process Architecture: Master and Workers
Nginx uses a multi-process architecture with a strict separation of concerns. When you start Nginx, a single master process spawns one or more worker processes. The master never handles client traffic directly. Its responsibilities are limited to reading and validating configuration, binding to listen sockets, managing worker lifecycle (spawning, signaling, graceful shutdown), and performing privileged operations like binding to port 80 that workers (which drop privileges) cannot do.
Each worker process is an independent, single-threaded event loop. Workers inherit the listening sockets from the master via fork(), and each worker independently accepts new connections on those sockets. There is no inter-worker communication for request handling -- each worker is self-contained. This eliminates the need for locks, mutexes, or shared memory for the hot path (request processing), which is a major reason Nginx scales so well on multi-core machines.
The typical configuration is one worker per CPU core:
worker_processes auto; # one per core
events {
worker_connections 16384;
use epoll; # Linux; kqueue on FreeBSD/macOS
multi_accept on;
}
With worker_connections 16384 and 8 workers, a single Nginx instance can maintain 131,072 simultaneous connections. In reverse proxy mode, each client connection also requires an upstream connection, so the effective limit is halved for proxied requests -- but that is still 65,536 concurrent proxied requests on a single machine, far beyond what thread-per-connection architectures can sustain.
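To make the capacity arithmetic concrete, here is a tiny Python helper (illustrative only, not part of Nginx):

```python
def max_connections(workers: int, worker_connections: int, proxied: bool = False) -> int:
    """Upper bound on simultaneous connections for an Nginx instance.

    In reverse proxy mode each client connection also consumes a slot
    for its upstream connection, halving the effective limit.
    """
    total = workers * worker_connections
    return total // 2 if proxied else total

print(max_connections(8, 16384))                # 131072 raw connection slots
print(max_connections(8, 16384, proxied=True))  # 65536 effective proxied requests
```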
Accept Mutex and Socket Sharding
When multiple workers share the same listening socket, a classic problem arises: the thundering herd. When a new connection arrives, the kernel wakes all workers blocked on epoll_wait() for that socket, but only one can accept the connection. The rest wake up for nothing, wasting CPU cycles.
Nginx historically solved this with the accept_mutex directive: a shared lock that ensures only one worker at a time listens for new connections on each socket. The worker holding the mutex accepts a batch of connections, then releases it for the next worker. This works but creates slight unfairness -- the worker that holds the mutex most often gets more connections.
Modern kernels and Nginx versions solve this more elegantly with SO_REUSEPORT (Linux 3.9+), which creates a separate listen socket per worker in the kernel. The kernel itself distributes incoming connections across these sockets, eliminating both the thundering herd and the need for a userspace mutex. Enable it with:
server {
listen 80 reuseport;
listen 443 ssl reuseport;
}
In benchmarks published by the Nginx team, the reuseport option delivered up to 2-3x better throughput under high connection rates compared to the accept mutex approach, because workers never contend with each other.
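The kernel-level behavior is easy to demonstrate: with SO_REUSEPORT set before bind(), multiple sockets in one process (or across forked workers) can listen on the same port, just as each Nginx worker does under reuseport. A minimal sketch, assuming a Linux 3.9+ or macOS kernel that exposes SO_REUSEPORT:

```python
import socket

def reuseport_socket(port: int) -> socket.socket:
    """Create a listen socket with SO_REUSEPORT, as each worker would."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set BEFORE bind(); the kernel then allows a second bind
    # to the same addr:port and load-balances incoming connections.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)
    return s

a = reuseport_socket(0)            # let the kernel pick a free port
port = a.getsockname()[1]
b = reuseport_socket(port)         # second socket, same port -- no EADDRINUSE
print(a.getsockname()[1] == b.getsockname()[1])  # True
a.close(); b.close()
```

Without the setsockopt call, the second bind() would fail with EADDRINUSE.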
The Event Loop: How a Single Thread Handles Thousands of Connections
Each Nginx worker runs a single event loop that multiplexes I/O across all its connections using the operating system's most efficient mechanism: epoll on Linux, kqueue on FreeBSD and macOS. The fundamental insight is that network I/O is overwhelmingly waiting -- waiting for data to arrive, waiting for send buffers to drain, waiting for upstream responses. A thread-per-connection model wastes a thread for each wait. Nginx's event-driven model lets one thread manage all the waits simultaneously.
The core loop is conceptually simple:
while (running) {
events = epoll_wait(epfd, event_list, max_events, timeout);
for each event in events {
if event is a new connection:
accept() and register with epoll
if event is readable:
read data, advance request state machine
if event is writable:
write buffered response data
if event is a timer:
handle timeout (close idle connection, retry upstream, etc.)
}
process_posted_events();
expire_timers();
}
Every I/O operation in Nginx is non-blocking. When Nginx calls read() on a socket, it gets whatever data is available in the kernel buffer right now and returns immediately -- it never blocks waiting for more. If no data is available, the call returns EAGAIN, and Nginx goes back to processing other events. When more data arrives, the kernel signals the event loop via epoll, and Nginx picks up where it left off.
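The same EAGAIN contract can be observed directly from Python, where the error surfaces as BlockingIOError:

```python
import socket

# A non-blocking socket returns immediately: data if any is buffered,
# otherwise EAGAIN/EWOULDBLOCK (raised as BlockingIOError in Python).
# This is the contract the Nginx event loop is built on.
a, b = socket.socketpair()
a.setblocking(False)

try:
    a.recv(4096)                 # nothing buffered yet
except BlockingIOError:
    print("EAGAIN: no data, go service other connections")

b.send(b"hello")                 # peer writes; the kernel buffers it
print(a.recv(4096))              # now recv returns immediately with the data
```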
This is fundamentally different from how Apache's prefork MPM works. In Apache, each connection gets a dedicated process (or thread in the worker MPM). That process blocks on read(), waiting for the client to send data. While blocked, it consumes a full thread's stack (8 MB of virtual address space by default on Linux), an OS scheduler slot, and context-switch overhead. At 10,000 connections, that is 80 GB of virtual stack space alone -- resident memory is lower, but the scheduling and memory overhead are still unworkable. Nginx's event-driven worker handles those same 10,000 connections in a single thread with a few hundred MB of total memory, because each connection is just a small state machine structure (the ngx_connection_t struct is roughly 232 bytes on 64-bit systems).
Timer Management
Nginx maintains a red-black tree of timers for connection timeouts, keepalive timeouts, upstream response timeouts, and cache expiration. After processing I/O events, the event loop checks the timer tree and fires any expired timers. Timers are how Nginx enforces directives like client_body_timeout, proxy_read_timeout, and keepalive_timeout. They are also how Nginx detects unresponsive upstreams and triggers failover.
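Nginx's timer tree is a red-black tree keyed by expiry time; a binary heap gives the same insert and peek-min behavior, so the mechanism (not Nginx's actual data structure) can be sketched like this:

```python
import heapq
import itertools

class TimerQueue:
    """Min-heap of (deadline, id, callback) -- a stand-in for Nginx's
    red-black timer tree; both give O(log n) insert and O(1) peek-min."""
    def __init__(self):
        self._heap = []
        self._ids = itertools.count()   # tie-breaker so callbacks never compare

    def add(self, delay: float, callback, now: float) -> None:
        heapq.heappush(self._heap, (now + delay, next(self._ids), callback))

    def next_timeout(self, now: float):
        """How long epoll_wait may sleep before the nearest timer fires."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def expire(self, now: float) -> None:
        """Fire every timer whose deadline has passed."""
        while self._heap and self._heap[0][0] <= now:
            _, _, cb = heapq.heappop(self._heap)
            cb()

fired = []
tq = TimerQueue()
tq.add(1.0, lambda: fired.append("keepalive_timeout"), now=0.0)
tq.add(5.0, lambda: fired.append("proxy_read_timeout"), now=0.0)
print(tq.next_timeout(now=0.0))   # 1.0 -- sleep at most 1s in epoll_wait
tq.expire(now=1.5)                # only the keepalive timer has expired
print(fired)                      # ['keepalive_timeout']
```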
Request Processing Phases
When Nginx receives an HTTP request, it does not simply hand it to a handler. Instead, the request passes through a series of processing phases, each consisting of one or more registered handlers that execute in order. This phase-based architecture is what makes Nginx's module system so powerful -- modules hook into specific phases without needing to understand the rest of the pipeline.
The phases, in execution order, are:
- POST_READ -- Runs immediately after the full request headers are read. The realip module hooks here to replace the client IP with an X-Forwarded-For or X-Real-IP header value.
- SERVER_REWRITE -- Executes rewrite directives defined at the server {} block level, before location matching.
- FIND_CONFIG -- The location matching phase. Nginx evaluates all location blocks in the selected server {} block and finds the best match for the request URI. This phase is internal; no modules hook into it.
- REWRITE -- Executes rewrite directives within the matched location {} block. If a rewrite changes the URI, Nginx loops back to FIND_CONFIG (up to 10 times by default, to prevent infinite loops).
- POST_REWRITE -- Internal phase that handles the loop-back to FIND_CONFIG after a rewrite.
- PREACCESS -- Rate limiting (limit_req), connection limiting (limit_conn), and similar pre-authorization checks run here.
- ACCESS -- Authorization checks: allow/deny IP ACLs, auth_basic, auth_request (subrequest-based auth).
- POST_ACCESS -- Internal phase that processes the result of ACCESS phase handlers (implements the satisfy directive logic).
- PRECONTENT -- The try_files directive runs here, checking for file existence before delegating to a content handler.
- CONTENT -- The main content generation phase. Only one content handler runs: proxy_pass, fastcgi_pass, static file serving, return, etc. The first handler that claims the request wins.
- LOG -- Runs after the response is sent. The access_log directive hooks here.
Understanding these phases is critical for debugging complex Nginx configurations. A common mistake is placing an auth_basic directive in a location that also has try_files and wondering why authentication seems to be bypassed -- the issue is usually that try_files (PRECONTENT) causes an internal redirect to a different location that lacks the auth_basic directive.
Location Matching: How Nginx Routes Requests
Location matching is one of the most misunderstood aspects of Nginx. The matching algorithm uses a priority system, not first-match:
- Exact match (= /path) -- If the URI matches exactly, Nginx uses this location immediately and stops searching. This is the fastest lookup.
- Preferential prefix (^~ /path) -- If the longest matching prefix location is marked with ^~, Nginx uses it without checking regex locations.
- Regex match (~ /pattern or ~* /pattern) -- Regex locations are evaluated in the order they appear in the config file. The first match wins.
- Longest prefix match (/path) -- If no regex matches, Nginx uses the longest matching prefix location.
This means that a regex location can override a longer prefix match, which surprises many engineers:
location /api/v2/users {
# Longest prefix match -- but will NOT be used if a regex matches
proxy_pass http://users_service;
}
location ~ ^/api/ {
# This regex matches /api/v2/users too -- and it wins!
proxy_pass http://generic_api;
}
The fix is either to use ^~ on the prefix location or to reorder the regex locations carefully. In production, excessive use of regex locations is a common source of routing bugs and performance issues (each regex must be evaluated sequentially until one matches).
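The priority rules can be condensed into a short sketch (a simplification of Nginx's real matcher, which also handles nested locations and case-insensitive regexes):

```python
import re

def match_location(uri, exact, preferential, regexes, prefixes):
    """Nginx-style location selection, simplified.

    exact: set of exact-match URIs; preferential / prefixes: prefix strings
    (with and without ^~); regexes: patterns in config-file order.
    """
    if uri in exact:
        return ("=", uri)                        # exact match wins outright
    # Find the longest matching prefix across both prefix kinds.
    candidates = [p for p in preferential + prefixes if uri.startswith(p)]
    best = max(candidates, key=len, default=None)
    if best is not None and best in preferential:
        return ("^~", best)                      # ^~ skips regex evaluation
    for pattern in regexes:                      # config order, first match wins
        if re.search(pattern, uri):
            return ("~", pattern)
    return ("prefix", best) if best is not None else None

# The surprise from the example above: the regex beats the longer prefix.
print(match_location("/api/v2/users",
                     exact=set(),
                     preferential=[],
                     regexes=[r"^/api/"],
                     prefixes=["/api/v2/users", "/"]))   # ('~', '^/api/')
```

Moving "/api/v2/users" into the preferential (^~) list makes it win again, which is exactly the fix described above.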
Reverse Proxy and Upstream Architecture
Nginx's most important production role is as a reverse proxy: it terminates client connections and forwards requests to backend servers (called "upstreams"). This is a full Layer 7 load balancer architecture -- Nginx parses the complete HTTP request, makes a routing decision, opens (or reuses) a connection to a backend, forwards the request, receives the response, and sends it back to the client.
The basic configuration is straightforward:
upstream backend {
server 10.0.1.10:8080;
server 10.0.1.11:8080;
server 10.0.1.12:8080;
}
server {
listen 443 ssl;
server_name api.example.com;
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
But the simplicity of the config belies significant complexity in how Nginx handles the proxy lifecycle. When a request hits proxy_pass, Nginx:
- Selects an upstream server using the configured load balancing algorithm
- Checks the connection pool for a reusable keepalive connection to that server
- If no keepalive connection is available, opens a new TCP connection (and optionally performs a TLS handshake for https:// upstreams)
- Forwards the request headers and body to the upstream
- Reads the upstream response, buffering it according to proxy_buffering settings
- Sends the response to the client
- If the upstream indicated Connection: keep-alive, returns the connection to the pool; otherwise, closes it
Upstream Keepalive Connections
Without keepalive connections, every proxied request pays the cost of a TCP handshake (and TLS handshake for HTTPS upstreams) to the backend. For high-throughput services, this is catastrophically wasteful. Nginx maintains a per-worker connection pool to each upstream:
upstream backend {
server 10.0.1.10:8080;
server 10.0.1.11:8080;
keepalive 64; # keep up to 64 idle connections per worker
keepalive_timeout 60s; # close idle connections after 60s
keepalive_requests 1000; # reuse each connection for up to 1000 requests
}
location / {
proxy_pass http://backend;
proxy_http_version 1.1; # required for keepalive
proxy_set_header Connection ""; # clear the Connection header
}
The proxy_http_version 1.1 and proxy_set_header Connection "" lines are critical. HTTP/1.0 defaults to closing connections after each request, and Nginx defaults to HTTP/1.0 for upstream connections. You must explicitly set HTTP/1.1 and clear the Connection header to enable keepalive. This is one of the most common Nginx misconfigurations -- omitting these lines means every request to the backend incurs a full TCP handshake, sometimes adding 1-2ms of latency per request.
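The pool mechanics can be sketched in a few lines (an illustration of the idea, not Nginx's actual ngx_http_upstream_keepalive module):

```python
from collections import defaultdict, deque

class KeepalivePool:
    """Per-worker idle-connection pool, shaped like the "keepalive N"
    upstream directive: at most N idle connections kept per upstream."""
    def __init__(self, max_idle: int):
        self.max_idle = max_idle
        self.idle = defaultdict(deque)            # upstream addr -> idle conns

    def checkout(self, addr):
        if self.idle[addr]:
            return self.idle[addr].popleft()      # reuse: no TCP/TLS handshake
        return f"new-conn-to-{addr}"              # stand-in for connect()

    def checkin(self, addr, conn):
        if len(self.idle[addr]) < self.max_idle:
            self.idle[addr].append(conn)          # keep warm for the next request
        # else: pool full -> the connection would simply be closed

pool = KeepalivePool(max_idle=64)
c1 = pool.checkout("10.0.1.10:8080")    # cold: opens a new connection
pool.checkin("10.0.1.10:8080", c1)
c2 = pool.checkout("10.0.1.10:8080")    # warm: the same connection is reused
print(c1 is c2)                         # True
```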
One caveat: open-source Nginx does not speak HTTP/2 to generic HTTP upstreams -- proxy_http_version accepts only 1.0 and 1.1. The exception is gRPC: the grpc_pass directive (Nginx 1.13.10+) proxies over HTTP/2 and multiplexes many concurrent requests across a single TCP connection:
upstream backend_grpc {
server 10.0.1.10:8443;
keepalive 32;
}
location / {
grpc_pass grpcs://backend_grpc;
}
Load Balancing Algorithms
Nginx supports several upstream load balancing strategies:
- Round-robin (default) -- Distributes requests evenly across all upstream servers in order. Weighted round-robin is supported via the weight parameter: server 10.0.1.10:8080 weight=3; sends three times as many requests to that server.
- Least connections (least_conn) -- Sends each request to the server with the fewest active connections. Better than round-robin when request durations vary significantly.
- IP hash (ip_hash) -- Hashes the client's IP address to consistently route the same client to the same backend. Provides session affinity but breaks when clients are behind a NAT or CDN (many clients share one IP).
- Generic hash (hash) -- Hashes an arbitrary key (URL, cookie, header) for consistent routing: hash $request_uri consistent;. The consistent parameter uses ketama consistent hashing, which minimizes redistribution when servers are added or removed.
- Random with two choices (random two least_conn) -- Picks two servers at random and sends the request to the one with fewer active connections. This "power of two choices" algorithm provides near-optimal load distribution with minimal coordination, and is the recommended algorithm for large upstream pools.
Health Checks and Failover
Open-source Nginx performs passive health checks only. When a request to an upstream server fails (connection error, timeout, or a response matching proxy_next_upstream conditions), Nginx counts the failure against that server; once a server accumulates max_fails failures within a fail_timeout window, it is marked unavailable for the next fail_timeout seconds:
upstream backend {
server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.12:8080 backup; # only used when all primary servers are down
}
The proxy_next_upstream directive controls which failure types trigger failover to the next server:
proxy_next_upstream error timeout http_502 http_503 http_504;
proxy_next_upstream_tries 3; # try at most 3 servers
proxy_next_upstream_timeout 10s; # give up retrying after 10s total
A subtle but important detail: proxy_next_upstream retries are only safe for idempotent requests, because retrying a non-idempotent POST against a different upstream can cause duplicate operations. Since Nginx 1.9.13, non-idempotent methods (POST, LOCK, PATCH) are not retried by default; adding the non_idempotent parameter to proxy_next_upstream re-enables retries for them -- something to weigh carefully in API gateway configurations.
Response Buffering
Buffering is one of Nginx's most important features as a reverse proxy, and one of the least understood. With proxy_buffering on (the default), Nginx reads the upstream response into memory buffers (spilling to temporary files on disk when the response exceeds the proxy_buffers allocation) as fast as the upstream can produce it, before the client has consumed it. This has a critical benefit: it frees the upstream connection as quickly as possible.
Consider a scenario where the backend generates a 10 MB response in 50ms, but the client is on a slow 3G connection that takes 30 seconds to download 10 MB. Without buffering, the backend connection is held open for the entire 30 seconds. With buffering, Nginx absorbs the response in 50ms, releases the upstream connection, and drip-feeds the data to the slow client from its own buffers. The backend served one fast request; Nginx handles the slow client.
This is why Nginx is so effective at protecting backends from slow clients -- a pattern sometimes called request/response shielding. The relevant directives:
proxy_buffering on;
proxy_buffer_size 4k; # buffer for the first part of the response (headers)
proxy_buffers 8 8k; # 8 buffers of 8k each for the response body
proxy_busy_buffers_size 16k; # how much can be sent to client while still reading from upstream
proxy_temp_file_write_size 16k;
proxy_max_temp_file_size 1024m; # max temp file size for very large responses
Disable buffering only when you need to stream responses in real-time (Server-Sent Events, chunked streaming, WebSocket-like long-poll patterns):
location /events {
proxy_pass http://backend;
proxy_buffering off;
proxy_cache off;
}
Caching
Nginx includes a full HTTP cache that can serve responses from disk (or memory-mapped files) without contacting the upstream at all. For read-heavy workloads, this is the single most impactful performance optimization available: a cached response is served in microseconds from the local filesystem, compared to milliseconds for a proxied request.
The cache is configured in two parts -- defining the cache zone, and enabling it per-location:
# In http {} block: define a shared memory zone and disk path
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=app_cache:64m
max_size=10g inactive=60m use_temp_path=off;
# In location {} block: enable caching
location /api/ {
proxy_pass http://backend;
proxy_cache app_cache;
proxy_cache_valid 200 10m; # cache 200 responses for 10 minutes
proxy_cache_valid 404 1m; # cache 404 responses for 1 minute
proxy_cache_key $scheme$host$request_uri;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
proxy_cache_lock on; # only one request populates the cache
proxy_cache_lock_timeout 5s;
add_header X-Cache-Status $upstream_cache_status;
}
Several of these directives deserve deeper explanation:
- proxy_cache_use_stale -- This is arguably the most important caching directive. It tells Nginx to serve stale (expired) cached content when the upstream is unavailable (error, timeout, 5xx). This transforms Nginx from a simple cache into a resilience layer: even if your backend goes down completely, clients continue to receive cached responses. This is the same principle that CDNs use for origin shielding.
- proxy_cache_lock -- Without this, a cache miss on a popular URL under heavy traffic causes a thundering herd to the upstream: thousands of requests all miss the cache simultaneously and all proxy to the backend. With proxy_cache_lock on, only the first request for a given cache key goes to the upstream; all subsequent requests for the same key wait for the first to complete and are then served from cache.
- keys_zone -- The shared memory zone stores cache metadata (keys and expiration times). 1 MB of keys_zone stores roughly 8,000 keys. The actual cached response bodies are stored on disk at the proxy_cache_path.
Nginx's cache also respects Cache-Control headers from upstreams. If the backend sends Cache-Control: no-store, Nginx will not cache the response (unless you override with proxy_ignore_headers Cache-Control). The Vary header is also respected, creating separate cache entries for different header values -- important for content negotiation.
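The proxy_cache_lock behavior is essentially single-flight request coalescing. A minimal threaded sketch of the idea (not Nginx's real shared-memory implementation, which coordinates across worker processes):

```python
import threading
import time

class SingleFlightCache:
    """Concurrent misses for the same key collapse into one upstream
    fetch; everyone else waits, then reads the freshly cached value."""
    def __init__(self):
        self.data = {}
        self.locks = {}
        self.guard = threading.Lock()

    def _key_lock(self, key):
        with self.guard:                     # one lock object per cache key
            return self.locks.setdefault(key, threading.Lock())

    def get_or_fetch(self, key, fetch):
        if key in self.data:                 # fast path: cache hit
            return self.data[key]
        with self._key_lock(key):            # only one fetcher per key
            if key not in self.data:         # double-check after waiting
                self.data[key] = fetch(key)
            return self.data[key]

upstream_calls = []
def slow_upstream(key):
    upstream_calls.append(key)
    time.sleep(0.05)                         # simulate backend latency
    return f"response-for-{key}"

cache = SingleFlightCache()
threads = [threading.Thread(target=cache.get_or_fetch,
                            args=("/api/hot", slow_upstream))
           for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(len(upstream_calls))                   # 1 -- only one request hit the backend
```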
TLS Termination
Nginx is the TLS termination point for a huge fraction of the internet's HTTPS traffic. TLS termination means Nginx handles the TLS handshake with the client, decrypts incoming data, and forwards plaintext HTTP to the backend. This offloads the computationally expensive cryptographic operations from application servers.
A production TLS configuration:
server {
listen 443 ssl;
http2 on; # replaces the deprecated "listen ... http2" parameter (Nginx 1.25.1+)
server_name example.com;
ssl_certificate /etc/nginx/ssl/example.com.pem;
ssl_certificate_key /etc/nginx/ssl/example.com.key;
# TLS 1.2 and 1.3 only -- disable older versions
ssl_protocols TLSv1.2 TLSv1.3;
# Prefer server cipher order for TLS 1.2
ssl_prefer_server_ciphers on;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
# Session caching -- avoids full handshake on reconnection
ssl_session_cache shared:SSL:50m;
ssl_session_timeout 1d;
ssl_session_tickets on;
# OCSP Stapling -- avoids client-side OCSP lookups
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 1.1.1.1 valid=300s;
resolver_timeout 5s;
# HSTS
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
}
TLS Performance Considerations
The TLS handshake is the most CPU-intensive part of Nginx's work. A full TLS 1.2 handshake requires the server to perform an RSA or ECDSA signature operation and an ECDH key exchange -- operations that take hundreds of microseconds to a few milliseconds each. On a modern server core, Nginx can perform on the order of 1,000-2,000 full handshakes per second with RSA-2048 certificates, or tens of thousands with ECDSA P-256.
Several mechanisms reduce TLS overhead:
- Session resumption --
ssl_session_cachestores TLS session parameters so returning clients can resume a previous session without a full handshake. This is critical for performance: a resumed session skips the expensive asymmetric cryptography entirely. - TLS 1.3 -- Reduces the handshake to a single round trip (1-RTT) and supports 0-RTT resumption. With QUIC/HTTP3, the transport and TLS handshakes are combined for even faster connection setup.
- ECDSA certificates -- ECDSA P-256 signing is roughly an order of magnitude faster than RSA 2048-bit signing. Switching from RSA to ECDSA certificates can substantially increase TLS handshake throughput.
- OCSP stapling -- Without stapling, the client must make a separate OCSP request to the certificate authority to check revocation status, adding 50-200ms of latency. Stapling attaches the OCSP response to the TLS handshake, eliminating this round trip.
SSL Session Ticket Keys and Forward Secrecy
A subtle security consideration: ssl_session_tickets encrypts session state with a symmetric key stored in memory. If this key is compromised, an attacker can decrypt all sessions that were encrypted with it. Nginx generates a random ticket key at startup by default, but in a multi-server deployment, you need to share ticket keys across servers for session resumption to work cross-server -- and you need to rotate those keys regularly (every few hours) to limit exposure. Alternatively, you can disable session tickets and rely solely on server-side session caches, which provide forward secrecy at the cost of more memory usage.
Connection Handling for HTTP/2 and WebSocket
Modern Nginx handles HTTP/2 natively on the client side. When a client negotiates HTTP/2 via ALPN during the TLS handshake, Nginx speaks HTTP/2 on the client connection while typically proxying to backends over HTTP/1.1 (or HTTP/2 since Nginx 1.25.1). HTTP/2 multiplexing means the client sends many concurrent requests over a single TCP connection. Nginx demultiplexes these into individual requests and proxies them independently to upstreams.
This multiplexing has an important performance implication: with HTTP/1.1, browsers open 6-8 parallel connections per origin. With HTTP/2, the browser uses one connection. This means Nginx handles fewer connections but each carries more concurrent streams. The http2_max_concurrent_streams directive (default 128) controls how many parallel requests a single HTTP/2 connection can carry.
WebSocket proxying requires explicit support because WebSocket upgrades an HTTP connection to a persistent, bidirectional protocol -- which conflicts with Nginx's normal request/response cycle:
location /ws {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 3600s; # allow up to an hour between reads on an idle WebSocket
}
The proxy_read_timeout increase is critical: WebSocket connections are long-lived, and Nginx's default 60-second proxy read timeout would kill idle WebSocket connections. The longer timeout means the Nginx worker holds this connection open for the duration, which ties up one of the worker_connections slots for the lifetime of the WebSocket session.
Graceful Reload and Zero-Downtime Configuration Changes
One of Nginx's most valuable operational features is graceful configuration reload. When you run nginx -s reload (or send SIGHUP to the master process), the following sequence occurs:
- The master process reads and validates the new configuration
- If the config is valid, the master spawns new worker processes with the new configuration
- The master sends a graceful shutdown signal to the old workers
- Old workers stop accepting new connections but continue processing in-flight requests
- Once all in-flight requests complete, old workers exit
- New workers are now handling all traffic with the new configuration
This process is genuinely zero-downtime: at no point are connections dropped or requests lost. The listening sockets are never closed because they are owned by the master process, which persists across reloads. This is a significant advantage over architectures that require process restart for configuration changes.
However, there is a subtle operational risk: if old workers have long-lived connections (WebSockets, long-polling, streaming responses), they will persist indefinitely until those connections close. In pathological cases, you can accumulate dozens of old worker processes, each holding onto connections and memory. The worker_shutdown_timeout directive (Nginx 1.11.11+) forces old workers to close all connections after a deadline:
worker_shutdown_timeout 120s; # force old workers to exit after 2 minutes
Performance Tuning
Beyond the architectural decisions covered above, several OS-level and Nginx-level tunings are important for high-traffic deployments:
File Descriptors
Each connection requires a file descriptor. Each proxied connection requires two (client + upstream). Nginx's worker_rlimit_nofile directive sets the file descriptor limit per worker. This should be at least 2x worker_connections:
worker_rlimit_nofile 65535;
Sendfile and TCP Optimizations
sendfile on; # use kernel sendfile() for static files -- zero-copy
tcp_nopush on; # coalesce small packets (uses TCP_CORK on Linux)
tcp_nodelay on; # disable Nagle's algorithm for keepalive connections
sendfile avoids copying file data between kernel and user space -- the kernel sends the file directly from the page cache to the socket buffer. tcp_nopush and tcp_nodelay seem contradictory but actually complement each other: tcp_nopush coalesces data during the response body phase (filling full-size segments), while tcp_nodelay ensures the final partial segment is sent immediately when the response is complete.
Open File Cache
open_file_cache max=10000 inactive=60s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
This caches file descriptors, file sizes, modification times, and directory lookup results. For servers that serve many static files, this eliminates repeated stat() and open() system calls.
Nginx in the Network Context
Nginx typically sits at a specific position in the infrastructure stack. Traffic from the internet reaches your network via BGP routing to your IP address ranges. For example, Cloudflare (whose network you can explore at AS13335) historically ran Nginx across its edge before replacing it with an in-house Rust proxy (Pingora). Google (AS15169) uses a custom reverse proxy, but many of their cloud customers run Nginx as their frontend. AWS's Application Load Balancer operates at the same L7 layer as Nginx, and many teams place Nginx behind an ALB for additional routing flexibility.
In a typical production setup, the request path is: client -> CDN edge (which may itself run Nginx) -> your network's edge router -> Nginx (TLS termination + reverse proxy) -> application server. In containerized environments, Nginx often runs as an ingress controller (the Kubernetes Nginx Ingress Controller is essentially Nginx with a dynamic configuration layer), accepting traffic from the cluster's load balancer and routing it to backend pods.
Understanding how Nginx handles connections end-to-end -- from the initial TCP handshake, through TLS termination, HTTP/2 demultiplexing, upstream load balancing, response caching, and client delivery -- is essential for diagnosing the performance problems that inevitably arise in production. Most Nginx "performance issues" are not Nginx problems at all: they are misconfigured upstream keepalives, missing buffering directives, or sub-optimal TLS settings that compound under load. The architecture itself is remarkably efficient; the challenge is configuring it correctly for your specific workload.
Nginx vs. Other Reverse Proxies
Nginx is not the only option. Understanding where it fits relative to alternatives helps inform architectural decisions:
- HAProxy -- Stronger L4 load balancing, more sophisticated health checking (active health checks in the open-source version), better connection draining, and a runtime API for dynamic configuration. HAProxy lacks Nginx's static file serving, caching, and general-purpose web server capabilities. Many production architectures use both: HAProxy as the L4/TCP front-end and Nginx as the L7/HTTP reverse proxy behind it.
- Envoy -- Built for service mesh and microservice architectures. Envoy has superior observability (distributed tracing, detailed metrics), dynamic configuration via xDS APIs (no reload needed), and native support for gRPC and HTTP/2 to upstreams. It is heavier than Nginx and not typically used as a general-purpose web server.
- Caddy -- Automatic HTTPS via ACME/Let's Encrypt, simpler configuration. Good for smaller deployments where operational simplicity is more important than fine-grained control. Not as battle-tested under extreme load.
For most production web applications, Nginx remains the default choice: it is fast, well-understood, extensively documented, and has a two-decade track record of reliability under the highest traffic loads on the internet. Its event-driven architecture solved the C10K problem and, with modern hardware, comfortably handles the C1M problem (one million concurrent connections) on a single machine.