How GraphQL Works: Schema, Queries, Resolvers, and Execution

GraphQL is a query language for APIs and a runtime for executing those queries against a type system you define for your data. Unlike REST, where the server dictates fixed response shapes for each endpoint, GraphQL lets the client specify exactly what data it needs in a single request. Developed internally at Facebook in 2012 and open-sourced in 2015, GraphQL addresses the over-fetching, under-fetching, and endpoint proliferation problems that plague REST APIs at scale. The specification (currently the October 2021 edition) defines a type system, query language, execution semantics, and validation rules — but deliberately leaves transport, serialization format, and caching strategy to the implementation.

The Type System: Schema Definition Language

Every GraphQL API is defined by a schema written in the Schema Definition Language (SDL). The schema is the contract between client and server: it enumerates every type, field, argument, and relationship available for querying. This is not documentation that can drift from reality — it is the source of truth that the runtime enforces at execution time.

GraphQL has five built-in scalar types: Int, Float, String, Boolean, and ID. Custom scalars (like DateTime, URL, or JSON) can be defined for domain-specific data. Beyond scalars, the type system supports object types, interfaces, unions, enums, and input types.

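# Custom scalar used by Prefix.firstSeen and Prefix.lastUpdated
scalar DateTime

type Query {
  autonomousSystem(asn: Int!): AutonomousSystem
  prefix(cidr: String!): Prefix
  # The SearchConnection type is not shown in this excerpt
  search(query: String!, first: Int = 10): SearchConnection!
}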

type AutonomousSystem {
  asn: Int!
  name: String
  country: String
  prefixes(first: Int, after: String): PrefixConnection!
  peers: [AutonomousSystem!]!
  upstreams: [AutonomousSystem!]!
}

type Prefix {
  cidr: String!
  origin: AutonomousSystem
  asPath: [Int!]!
  firstSeen: DateTime
  lastUpdated: DateTime
  rpkiStatus: RPKIStatus!
}

enum RPKIStatus {
  VALID
  INVALID
  NOT_FOUND
}

type PrefixConnection {
  edges: [PrefixEdge!]!
  pageInfo: PageInfo!
  totalCount: Int!
}

type PrefixEdge {
  node: Prefix!
  cursor: String!
}

type PageInfo {
  hasNextPage: Boolean!
  endCursor: String
}

The exclamation mark (!) denotes a non-nullable type. [Int!]! means a non-nullable list of non-nullable integers: the list itself cannot be null, and no element in the list can be null. Because nullability is declared separately for the list and for its elements, there are four combinations ([Int], [Int!], [Int]!, and [Int!]!), which gives schema designers precise control over the guarantees they offer to clients.
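The list nullability rules can be made concrete with a small checker. This is a minimal sketch, not how any GraphQL library implements it: the function name check_int_list and its two boolean flags are assumptions made for illustration, and a real server propagates a null violation up to the nearest nullable parent instead of returning a message.

```python
from typing import Any, Optional


def check_int_list(value: Any, list_nullable: bool, item_nullable: bool) -> Optional[str]:
    """Return None if value satisfies the list type, else an error message."""
    if value is None:
        return None if list_nullable else "list must not be null"
    if not isinstance(value, list):
        return "expected a list"
    for index, item in enumerate(value):
        if item is None:
            if not item_nullable:
                return f"item {index} must not be null"
        elif isinstance(item, bool) or not isinstance(item, int):
            # bool is a subclass of int in Python, so exclude it explicitly
            return f"item {index} is not an Int"
    return None


# The four SDL spellings, written as (list_nullable, item_nullable).
COMBINATIONS = {
    "[Int]": (True, True),
    "[Int!]": (True, False),
    "[Int]!": (False, True),
    "[Int!]!": (False, False),
}

for sdl, (list_nullable, item_nullable) in COMBINATIONS.items():
    for sample in (None, [1, None], [1, 2]):
        error = check_int_list(sample, list_nullable, item_nullable)
        print(f"{sdl:8} {str(sample):10} -> {error or 'ok'}")
```

Only [Int!]! rejects both a null list and a null element, which is why it is the usual choice for fields such as asPath.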

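The Connection pattern shown above (edges, nodes, cursors, pageInfo) follows the Relay Cursor Connections Specification. It has become the de facto standard for pagination in GraphQL APIs, even outside the Relay ecosystem, because a cursor marks a position relative to a specific item, so pages stay stable when rows are inserted or deleted between requests. Offset-based pagination cannot guarantee this: an insertion ahead of the current offset shifts every later item and produces duplicated or skipped rows. (The full specification also defines hasPreviousPage and startCursor on PageInfo; totalCount is a common extension rather than part of it.)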
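To illustrate why cursors tolerate insertions, here is a minimal sketch of a resolver helper that builds a PrefixConnection-shaped result. The helper names (encode_cursor, decode_cursor, paginate) and the choice of the base64-encoded CIDR string as the cursor are assumptions for this example; real APIs usually encode a database sort key.

```python
import base64
from typing import Dict, List, Optional


def encode_cursor(cidr: str) -> str:
    # Cursors are opaque to clients; base64 just discourages parsing them.
    return base64.urlsafe_b64encode(cidr.encode()).decode()


def decode_cursor(cursor: str) -> str:
    return base64.urlsafe_b64decode(cursor.encode()).decode()


def paginate(cidrs: List[str], first: int, after: Optional[str] = None) -> Dict:
    """Return a PrefixConnection-shaped dict for a list of CIDR strings."""
    ordered = sorted(cidrs)
    if after is not None:
        last_seen = decode_cursor(after)
        # Resume strictly after the last item the client saw.
        ordered = [c for c in ordered if c > last_seen]
    page = ordered[:first]
    edges = [{"node": {"cidr": c}, "cursor": encode_cursor(c)} for c in page]
    return {
        "edges": edges,
        "pageInfo": {
            "hasNextPage": len(ordered) > first,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
        "totalCount": len(cidrs),
    }


data = ["1.0.0.0/24", "1.1.1.0/24", "8.8.4.0/24", "8.8.8.0/24"]
page1 = paginate(data, first=2)
# A row is inserted between the two requests, inside the range already seen.
data.append("1.0.1.0/24")
page2 = paginate(data, first=2, after=page1["pageInfo"]["endCursor"])
print([e["node"]["cidr"] for e in page2["edges"]])
```

With an offset of 2 the second page would start at 1.1.1.0/24 again, because the inserted row shifted everything after it by one position.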

Operations: Queries, Mutations, and Subscriptions

GraphQL defines three root operation types, each serving a distinct purpose:

Queries are read operations. They are expected to be side-effect-free and idempotent. The client specifies the exact shape of the response by selecting fields from the schema:

query GetAS {
  autonomousSystem(asn: 13335) {
    asn
    name
    country
    prefixes(first: 5) {
      edges {
        node {
          cidr
          rpkiStatus
        }
      }
      totalCount
    }
  }
}

This query returns exactly the fields requested, no more and no less. The response shape mirrors the query shape (the edges list is abbreviated to two entries here):

{
  "data": {
    "autonomousSystem": {
      "asn": 13335,
      "name": "CLOUDFLARENET",
      "country": "US",
      "prefixes": {
        "edges": [
          { "node": { "cidr": "1.1.1.0/24", "rpkiStatus": "VALID" } },
          { "node": { "cidr": "1.0.0.0/24", "rpkiStatus": "VALID" } }
        ],
        "totalCount": 1842
      }
    }
  }
}

Mutations are write operations. They modify server-side state and return the resulting data. Unlike query fields, which the server may resolve in parallel, the top-level fields of a mutation are executed serially, in the order written, to ensure predictable ordering of side effects:

mutation AddAlert {
  createPrefixAlert(input: {
    prefix: "1.1.1.0/24"
    events: [HIJACK, WITHDRAWAL, ORIGIN_CHANGE]
  }) {
    alert {
      id
      prefix
      events
      createdAt
    }
    errors {
      field
      message
    }
  }
}

Subscriptions provide real-time updates via a persistent connection. When a client subscribes, the server pushes updates whenever the underlying data changes. This is typically implemented over WebSockets using the graphql-ws protocol (or the older subscriptions-transport-ws):

subscription WatchPrefix {
  prefixUpdate(cidr: "1.1.1.0/24") {
    cidr
    origin {
      asn
      name
    }
    asPath
    updateType
    timestamp
  }
}

Resolver Execution Model

The execution engine is the heart of a GraphQL server. When a query arrives, it passes through three phases: parsing (query string to AST), validation (AST against schema), and execution (AST traversal with resolver invocation).

Each field in the schema has an associated resolver function. The resolver receives four arguments:

parent (or root) — the result returned by the parent field's resolver
args — the arguments passed to this field in the query
context — a shared object containing per-request state (authentication, database connections, DataLoaders)
info — metadata about the query execution (field name, selected sub-fields, schema)

Execution proceeds top-down through the query. The root Query.autonomousSystem resolver fires first, receives args.asn = 13335, fetches the AS from the database, and returns the result object. The engine then resolves each selected child field (name, country, prefixes) using the returned object as the parent argument. For scalar fields, the default resolver simply reads the property from the parent object. For complex fields like prefixes, a custom resolver executes a database query or API call.
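The top-down traversal described above can be sketched as a toy executor. Everything here is an assumption made for illustration: the query is a pre-parsed nested dict instead of an AST, the data lives in in-memory tables, prefixes returns a plain list instead of a connection, and the RESOLVERS and FIELD_TYPES tables stand in for a real schema.

```python
from typing import Any, Dict

# Hypothetical in-memory data standing in for a database.
AS_TABLE = {13335: {"asn": 13335, "name": "CLOUDFLARENET", "country": "US"}}
PREFIX_TABLE = {13335: [{"cidr": "1.1.1.0/24"}, {"cidr": "1.0.0.0/24"}]}


def resolve_autonomous_system(parent, args, context, info):
    return context["as_table"].get(args["asn"])


def resolve_prefixes(parent, args, context, info):
    rows = context["prefix_table"].get(parent["asn"], [])
    return rows[: args.get("first", len(rows))]


def default_resolver(parent, args, context, info):
    # Fields without a custom resolver read the same-named property of the parent.
    return parent.get(info["field_name"])


# Custom resolvers keyed by (type name, field name).
RESOLVERS = {
    ("Query", "autonomousSystem"): resolve_autonomous_system,
    ("AutonomousSystem", "prefixes"): resolve_prefixes,
}
# Return type of each object-valued field, so the executor knows where to descend.
FIELD_TYPES = {
    ("Query", "autonomousSystem"): "AutonomousSystem",
    ("AutonomousSystem", "prefixes"): "Prefix",
}


def execute(type_name: str, parent: Any, selection: Dict, context: Dict) -> Dict:
    result = {}
    for field_name, (args, sub_selection) in selection.items():
        resolver = RESOLVERS.get((type_name, field_name), default_resolver)
        info = {"field_name": field_name, "parent_type": type_name}
        value = resolver(parent, args, context, info)
        if sub_selection and value is not None:
            child_type = FIELD_TYPES[(type_name, field_name)]
            if isinstance(value, list):
                value = [execute(child_type, item, sub_selection, context) for item in value]
            else:
                value = execute(child_type, value, sub_selection, context)
        result[field_name] = value
    return result


# Selection set: field name -> (arguments, nested selection or None).
QUERY = {
    "autonomousSystem": ({"asn": 13335}, {
        "name": ({}, None),
        "country": ({}, None),
        "prefixes": ({"first": 1}, {"cidr": ({}, None)}),
    })
}
context = {"as_table": AS_TABLE, "prefix_table": PREFIX_TABLE}
data = execute("Query", {}, QUERY, context)
print(data)
```

Note that resolve_prefixes receives the object returned by resolve_autonomous_system as its parent, which is exactly the hand-off that makes per-item database calls so easy to write by accident.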

This recursive descent execution model is elegant but introduces a critical performance hazard: the N+1 problem.

The N+1 Problem and DataLoader

Consider a query that fetches 50 prefixes and, for each, resolves the origin AS. The prefix list resolver fires once (1 query), then the origin resolver fires 50 times, once per prefix. If 30 of those prefixes share the same origin AS, the naive implementation executes 30 identical database queries for that AS record, 29 of them redundant. This is the N+1 problem: 1 query for the list plus N queries, one for each item's related data.

The standard solution is DataLoader, a utility that batches and deduplicates data-fetching requests within a single execution tick. Instead of fetching one AS per resolver call, DataLoader collects all requested AS numbers during the current execution tick, then fires a single batched query:

// Without DataLoader: N+1 queries
// Query 1: SELECT * FROM prefixes LIMIT 50
// Query 2: SELECT * FROM autonomous_systems WHERE asn = 13335
// Query 3: SELECT * FROM autonomous_systems WHERE asn = 13335  (duplicate!)
// Query 4: SELECT * FROM autonomous_systems WHERE asn = 15169
// ... 47 more queries

// With DataLoader: 2 queries total
// Query 1: SELECT * FROM prefixes LIMIT 50
// Query 2: SELECT * FROM autonomous_systems WHERE asn IN (13335, 15169, 32934, ...)

DataLoader works by deferring execution to the end of the current event loop tick (in Node.js) or using an equivalent batching mechanism in other runtimes. Each resolver calls loader.load(key), which returns a promise but does not execute immediately. After all resolvers in the current level have registered their keys, DataLoader calls the batch function once with all collected keys, then distributes results back to the individual promises.

The deduplication aspect is equally important: if 30 prefixes share origin AS 13335, DataLoader ensures the batch function receives AS 13335 only once. The caching is per-request — a new DataLoader instance is created for each GraphQL request to prevent stale data and cross-request leakage.
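The batching and deduplication behaviour can be sketched with asyncio. This is not the DataLoader library itself but a minimal imitation under stated assumptions: the class, batch_load_as, resolve_origin and the sample data are all hypothetical, and the dispatch is deferred with loop.call_soon so that it runs after the resolver tasks already queued on the event loop.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Hashable, List


class DataLoader:
    """Minimal per-request loader: batches and deduplicates keys."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Awaitable[List[object]]]):
        self._batch_fn = batch_fn
        self._cache: Dict[Hashable, asyncio.Future] = {}
        self._queue: List[Hashable] = []
        self._scheduled = False
        self._task = None

    def load(self, key: Hashable) -> asyncio.Future:
        if key in self._cache:
            return self._cache[key]  # deduplication: same key, same future
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._cache[key] = future
        self._queue.append(key)
        if not self._scheduled:
            self._scheduled = True
            # Runs after the callbacks already queued, i.e. after sibling resolvers.
            loop.call_soon(self._start_dispatch)
        return future

    def _start_dispatch(self) -> None:
        self._task = asyncio.ensure_future(self._dispatch())

    async def _dispatch(self) -> None:
        keys, self._queue, self._scheduled = self._queue, [], False
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for key in keys:
                self._cache[key].set_exception(exc)
            return
        for key, value in zip(keys, values):
            self._cache[key].set_result(value)


AS_NAMES = {13335: "CLOUDFLARENET", 15169: "GOOGLE"}
batch_calls: List[List[int]] = []


async def batch_load_as(asns):
    batch_calls.append(list(asns))  # stands in for one SELECT ... WHERE asn IN (...)
    return [{"asn": asn, "name": AS_NAMES.get(asn)} for asn in asns]


async def resolve_origin(prefix, loader):
    return await loader.load(prefix["origin_asn"])


async def handle_request():
    loader = DataLoader(batch_load_as)  # a fresh loader for each request
    prefixes = [
        {"cidr": f"10.0.{i}.0/24", "origin_asn": 13335 if i < 30 else 15169}
        for i in range(50)
    ]
    return await asyncio.gather(*(resolve_origin(p, loader) for p in prefixes))


origins = asyncio.run(handle_request())
print(len(origins), batch_calls)
```

Fifty resolver calls produce a single batch containing two distinct AS numbers, and because the loader is created inside handle_request its cache cannot leak between requests.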

Introspection and Tooling

One of GraphQL's most powerful features is introspection: the ability to query the schema itself. A spec-compliant server exposes the special __schema and __type fields on the query root type, and __schema exposes the full type system:

{
  __schema {
    types {
      name
      kind
      fields {
        name
        type { name kind ofType { name } }
      }
    }
    queryType { name }
    mutationType { name }
  }
}

This is what powers tools like GraphiQL, GraphQL Playground, and Apollo Studio: they query the schema at runtime to provide autocompletion, documentation, and query validation in the browser. There is no separate API documentation step; the schema is the documentation. This is a significant advantage over REST, where OpenAPI/Swagger specifications are often written and maintained separately from the implementation and can drift from it.

Production APIs often disable introspection to reduce their attack surface, since exposing the full schema reveals every type, field, and relationship available. This is a defense-in-depth measure, not a security boundary — a determined attacker can probe fields by name without introspection.

Persisted Queries and Security

Accepting arbitrary query strings from clients opens GraphQL servers to several attack vectors: deeply nested queries that cause exponential resolver execution, overly broad queries that fetch entire datasets, and query injection attacks. Persisted queries address these concerns by replacing arbitrary query strings with pre-registered query identifiers.

There are two approaches to persisted queries:

Automatic Persisted Queries (APQ) use content-addressable hashing. The client sends a SHA-256 hash of the query instead of the full query text. On a cache miss, the server replies with a PersistedQueryNotFound error, the client retries with both the hash and the full query, and the server stores the query keyed by hash so that subsequent requests need only the hash. This saves bandwidth but does not restrict which queries can be executed.

Registered Persisted Queries go further: only queries registered at build time are allowed. The server maintains a whitelist of query hashes extracted from the client codebase during the build process. Any query not in the whitelist is rejected. This eliminates the entire class of malicious query attacks and also improves performance, since the server can pre-compile and optimize registered queries.

# APQ request (hash only, no query text)
POST /graphql
{
  "extensions": {
    "persistedQuery": {
      "version": 1,
      "sha256Hash": "ecf4edb46db40b5132295c0291d62fb65d6759a9eedfa4d5d612dd5ec54a6b38"
    }
  },
  "variables": { "asn": 13335 }
}
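The server side of both approaches can be sketched with a small in-memory store. This is an illustration under assumptions, not any server's actual implementation: the class and method names are hypothetical, the store is a plain dict, and apart from PersistedQueryNotFound the error strings are invented for the example.

```python
import hashlib
from typing import Dict, Optional


def sha256_hex(query: str) -> str:
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryStore:
    """In-memory hash -> query map. allow_registration=False models a build-time allowlist."""

    def __init__(self, allow_registration: bool = True):
        self._queries: Dict[str, str] = {}
        self._allow_registration = allow_registration

    def register(self, query: str) -> str:
        """Build-time registration; returns the hash the client will send."""
        digest = sha256_hex(query)
        self._queries[digest] = query
        return digest

    def resolve(self, sha256_hash: str, query: Optional[str] = None) -> dict:
        known = self._queries.get(sha256_hash)
        if known is not None:
            return {"query": known}
        if query is None:
            # APQ cache miss: the client should retry with hash and full query.
            return {"errors": [{"message": "PersistedQueryNotFound"}]}
        if not self._allow_registration:
            return {"errors": [{"message": "query is not registered"}]}
        if sha256_hex(query) != sha256_hash:
            return {"errors": [{"message": "hash does not match query"}]}
        self._queries[sha256_hash] = query
        return {"query": query}


query_text = "query GetAS($asn: Int!) { autonomousSystem(asn: $asn) { name } }"
query_hash = sha256_hex(query_text)

apq = PersistedQueryStore()
print(apq.resolve(query_hash))              # miss: hash only
print(apq.resolve(query_hash, query_text))  # retry with the full query
print(apq.resolve(query_hash))              # hit: hash only
```

Constructing the store with allow_registration=False and calling register() at build time gives the stricter registered-queries behaviour, where a runtime request can never add a new query.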

Other security measures include query depth limiting (rejecting queries deeper than a threshold), query cost analysis (assigning cost to each field and rejecting queries exceeding a budget), and rate limiting per client or per query complexity.
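Depth limiting is the simplest of these to sketch. The version below is deliberately naive and rests on stated assumptions: it counts nested selection-set braces in the raw query text, skipping string literals, comments, and braces inside argument lists, and it does not expand fragments. A production server would measure depth on the parsed and validated AST instead. The function names are hypothetical.

```python
def selection_depth(query: str) -> int:
    """Return the maximum nesting depth of selection sets in a query string."""
    depth = max_depth = 0
    paren_level = 0
    in_string = False
    i = 0
    while i < len(query):
        ch = query[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "#":
            # Comments run to the end of the line.
            newline = query.find("\n", i)
            i = len(query) if newline == -1 else newline
        elif ch == "(":
            paren_level += 1
        elif ch == ")":
            paren_level -= 1
        elif paren_level == 0:
            # Braces inside parentheses belong to input objects, not selections.
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
        i += 1
    return max_depth


def enforce_depth_limit(query: str, max_depth: int) -> int:
    depth = selection_depth(query)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


GET_AS = """
query GetAS {
  autonomousSystem(asn: 13335) {
    name
    prefixes(first: 5) {
      edges {
        node {
          cidr
        }
      }
    }
  }
}
"""
print(selection_depth(GET_AS))
```

Connection-style pagination alone accounts for three of the five levels here (prefixes, edges, node), so a depth threshold has to be chosen with the schema's pagination pattern in mind.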

Schema Stitching and Federation

As GraphQL APIs grow, a single monolithic schema becomes a bottleneck — both organizationally (many teams editing one schema) and operationally (one service responsible for all data). Two approaches address this: schema stitching and federation.

Schema stitching was the original approach. A gateway service merges multiple GraphQL schemas into one, delegating fields to the appropriate upstream service. The gateway resolves cross-service references by making additional requests to upstream services. This works but is fragile: the gateway must know the relationships between schemas, and changes to upstream schemas require gateway updates.

Apollo Federation (and its open specification) took a different approach: each service defines its own schema and declares which types it can extend or reference. The gateway uses metadata directives to compose a unified schema automatically:

# Routing service schema
type AutonomousSystem @key(fields: "asn") {
  asn: Int!
  name: String
  country: String
  prefixes: [Prefix!]!
}

# Analytics service schema (extends the type)
extend type AutonomousSystem @key(fields: "asn") {
  asn: Int! @external
  trafficVolume: Float
  peeringScore: Int
  historicalUptime: [UptimeRecord!]!
}

The @key directive identifies the fields that uniquely identify an entity across services. The @external directive marks fields that are defined in another service but referenced locally. The gateway composes these into a single schema where a query for AutonomousSystem transparently fetches name and prefixes from the routing service and trafficVolume from the analytics service.

Federation introduces a query planner in the gateway that determines which services to call and in what order. For a query that spans three services, the planner builds an execution plan that parallelizes independent fetches and sequences dependent ones. This planning step is non-trivial: for complex queries with nested cross-service references, the planner must solve a dependency graph to minimize round trips between services.
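The planning step can be sketched as a topological sort that groups fetches into stages. This is a simplified illustration, not Apollo's planner: the fetch names, the third "rpki" service, and the dependency map are all hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each fetch maps to the set of fetches whose results it needs first.
# The analytics and rpki fetches both need entity keys returned by routing.
DEPENDENCIES: Dict[str, Set[str]] = {
    "routing.autonomousSystem": set(),
    "analytics.trafficVolume": {"routing.autonomousSystem"},
    "rpki.prefixStatus": {"routing.autonomousSystem"},
}


def plan_stages(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group fetches into stages; fetches within one stage can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for number, stage in enumerate(plan_stages(DEPENDENCIES), start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

The number of stages is the number of sequential round trips, which is the quantity a planner tries to minimize.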

Federation v2 (released 2022) introduced significant improvements: the @shareable directive for fields that more than one service is allowed to resolve, @override for gradual field migration between services, and @inaccessible for hiding implementation-detail fields from the public schema.

Transport: GraphQL Is Not HTTP-Specific

GraphQL is transport-agnostic, but in practice, most implementations use HTTP POST with a JSON body containing the query string, optional variables, and optional operation name. The GraphQL-over-HTTP specification (maintained by the GraphQL Foundation) standardizes this:

POST /graphql HTTP/1.1
Content-Type: application/json

{
  "query": "query GetAS($asn: Int!) { autonomousSystem(asn: $asn) { name } }",
  "variables": { "asn": 13335 },
  "operationName": "GetAS"
}

Some implementations also support HTTP GET for queries (not mutations), encoding the query as a URL parameter. This enables HTTP caching via CDN edge servers and browser caches, since GET requests with identical URLs return identical responses. This is critical for high-traffic APIs where the same queries are repeated millions of times.
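For GET requests to be cacheable, equivalent requests must produce byte-identical URLs. The sketch below builds such a URL with the query, variables, and operationName parameters; the endpoint is a placeholder, the function name is hypothetical, and serializing variables with sorted keys is one possible canonicalization choice, not something the specification mandates.

```python
import json
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_url(endpoint: str, query: str, variables: Optional[dict] = None,
                  operation_name: Optional[str] = None) -> str:
    """Encode a GraphQL query as URL parameters for an HTTP GET request."""
    params = {"query": query}
    if variables:
        # Sorted keys and compact separators keep equal variables byte-identical.
        params["variables"] = json.dumps(variables, sort_keys=True, separators=(",", ":"))
    if operation_name:
        params["operationName"] = operation_name
    return f"{endpoint}?{urlencode(params)}"


QUERY = "query GetAS($asn: Int!) { autonomousSystem(asn: $asn) { name } }"
url = build_get_url("https://api.example.com/graphql", QUERY, {"asn": 13335}, "GetAS")
print(url)
print(parse_qs(urlsplit(url).query)["variables"])
```

Long queries can exceed practical URL length limits, which is one reason GET is often combined with persisted query hashes.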

Subscriptions typically use WebSockets with the graphql-ws protocol. The client sends a connection initialization message with optional authentication, then subscribe/unsubscribe messages. The server pushes data messages when events occur. Some implementations use Server-Sent Events (SSE) for subscriptions when the client only needs server-to-client streaming without bidirectional communication.
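The message exchange can be sketched by building the JSON frames directly. The message types shown (connection_init, subscribe, next, complete) are those of the graphql-ws protocol; the helper function names and the authToken payload key are assumptions, since the protocol leaves the connection payload shape to the application. No socket is opened here.

```python
import json
from typing import List, Optional


def connection_init(token: Optional[str] = None) -> str:
    """First client message; the optional payload carries app-defined auth data."""
    message = {"type": "connection_init"}
    if token is not None:
        message["payload"] = {"authToken": token}  # hypothetical payload key
    return json.dumps(message)


def subscribe(operation_id: str, query: str, variables: Optional[dict] = None) -> str:
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return json.dumps({"id": operation_id, "type": "subscribe", "payload": payload})


def complete(operation_id: str) -> str:
    """Sent by the client to unsubscribe from one operation."""
    return json.dumps({"id": operation_id, "type": "complete"})


def handle_server_message(raw: str, updates: List[dict]) -> str:
    """Collect data from 'next' messages and return the message type."""
    message = json.loads(raw)
    if message["type"] == "next":
        updates.append(message["payload"]["data"])
    return message["type"]


updates: List[dict] = []
server_frame = json.dumps({
    "id": "1",
    "type": "next",
    "payload": {"data": {"prefixUpdate": {"cidr": "1.1.1.0/24", "updateType": "WITHDRAWAL"}}},
})
print(connection_init("secret"))
print(subscribe("1", 'subscription { prefixUpdate(cidr: "1.1.1.0/24") { cidr updateType } }'))
print(handle_server_message(server_frame, updates), updates)
print(complete("1"))
```

The id field ties every next message to the subscribe message that started it, so many subscriptions can share a single connection.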

For high-performance inter-service communication, GraphQL can in principle run over gRPC or other HTTP/2-based transports, though this is uncommon in practice; most deployments, including federation gateways talking to their subgraphs, send JSON over HTTP.

GraphQL vs REST: Trade-offs

The GraphQL vs REST debate is often framed as a replacement story, but the reality is a set of trade-offs:

Dimension	REST	GraphQL
Data fetching	Fixed response shapes, multiple endpoints	Client-specified shapes, single endpoint
Over-fetching	Common (endpoint returns all fields)	Eliminated (client selects fields)
Under-fetching	Common (requires multiple requests)	Eliminated (nested queries in one request)
Caching	HTTP caching works natively (GET, ETag, Cache-Control)	Requires custom caching (single POST endpoint defeats HTTP caching)
File uploads	Native (multipart/form-data)	Awkward (multipart spec exists but is not part of GraphQL spec)
Error handling	HTTP status codes (400, 404, 500)	Usually 200 with an errors array in the response body, even when some fields failed
Versioning	URL versioning (/v1/, /v2/) or header negotiation	No versioning (additive schema changes, @deprecated directive)
Tooling	Curl, Postman, any HTTP client	Any HTTP client works; specialized clients (Apollo, Relay, urql) add normalized caching and other conveniences
Learning curve	Low (uses HTTP semantics most developers know)	Medium (new query language, type system, execution model)
Backend complexity	Low (frameworks map routes to handlers)	High (resolver chains, DataLoader, query cost analysis)

GraphQL shines in environments with diverse clients (mobile apps needing minimal data, web apps needing rich data, internal tools needing different views of the same data) and complex object graphs with many relationships. It is less appropriate for simple CRUD APIs, file-serving endpoints, or situations where HTTP caching is critical and the engineering team lacks the capacity to implement a custom caching layer.

The caching disadvantage is significant. REST APIs benefit from decades of HTTP caching infrastructure: browsers, CDN edges, and reverse proxies all cache GET responses automatically based on Cache-Control headers. A GraphQL POST to /graphql bypasses all of this. Solutions exist — APQ with GET, response-level caching, normalized cache invalidation — but they require additional engineering effort.

Schema Evolution and Deprecation

GraphQL's approach to API evolution is fundamentally different from REST versioning. Instead of creating /v2/ endpoints with breaking changes, GraphQL encourages additive changes: new fields, new types, and new optional arguments are safe to add because clients only receive fields they request. (A new required argument is a breaking change, since existing queries do not supply it.)

When a field must be removed, the @deprecated directive marks it for deprecation with a reason and a migration path:

type AutonomousSystem {
  asn: Int!
  name: String
  asName: String @deprecated(reason: "Use 'name' instead. Will be removed 2027-01-01.")
}

Deprecated fields continue to work but are hidden from introspection by default and flagged in tooling. Usage metrics (available in Apollo Studio and similar platforms) track which clients still use deprecated fields, enabling data-driven removal timelines.

This model works well for public APIs with many independent clients. For internal APIs where you control all clients, coordinated breaking changes may be more pragmatic than maintaining deprecated fields indefinitely.

Performance: Beyond N+1

Beyond the N+1 problem, GraphQL servers face several performance challenges:

Query complexity analysis prevents expensive queries from overwhelming the server. Each field is assigned a cost (typically 1 for scalars, higher for fields that trigger database queries or external API calls, and multiplied by pagination limits for list fields). The server calculates total query cost before execution and rejects queries exceeding the budget:

# Cost calculation example:
# autonomousSystem: cost 1
#   prefixes(first: 100): cost 1 * 100 = 100
#     origin: cost 1 * 100 = 100
#       peers: cost 1 * 100 * 50 = 5000  <-- exceeds budget!
# Total: 5201 (budget: 1000) → REJECTED
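The arithmetic above can be reproduced with a short recursive calculator. This is a sketch under assumptions: the Field dataclass is a stand-in for a parsed selection set, every field has a base cost of 1, and the list size of 50 for peers is an assumed default for an unpaginated list, matching the figure used in the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    name: str
    list_size: Optional[int] = None  # `first` argument, or an assumed default for lists
    base_cost: int = 1
    children: List["Field"] = field(default_factory=list)


def field_cost(node: Field, multiplier: int = 1) -> int:
    """Cost of a field plus its sub-selection, scaled by enclosing list sizes."""
    size = node.list_size if node.list_size is not None else 1
    own = node.base_cost * multiplier * size
    return own + sum(field_cost(child, multiplier * size) for child in node.children)


def within_budget(root: Field, budget: int) -> bool:
    return field_cost(root) <= budget


QUERY = Field("autonomousSystem", children=[
    Field("prefixes", list_size=100, children=[
        Field("origin", children=[
            Field("peers", list_size=50),
        ]),
    ]),
])

total = field_cost(QUERY)
print(total, "ACCEPTED" if within_budget(QUERY, 1000) else "REJECTED")
```

Because the multiplier compounds at every list level, lowering the page size on prefixes from 100 to 10 would bring this query down to 521 and inside the budget.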

Response caching in GraphQL requires more sophistication than HTTP caching. Normalized caching (used by Apollo Client and Relay on the client side) stores each entity separately, keyed by type and ID. When a mutation updates an entity, all queries referencing that entity are automatically updated. Server-side, partial query caching can store resolver results keyed by field path and arguments, avoiding redundant computation for shared sub-queries.
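The idea behind normalized caching can be sketched in a few lines. This is a toy model and not how Apollo Client or Relay store data internally: the class name, the "__ref" marker, and the rule that an entity is any object with both __typename and id are assumptions for the example, and the read method would recurse forever on cyclic references.

```python
from typing import Any, Dict


class NormalizedCache:
    """Stores each entity once, keyed by 'Type:id'; results hold references."""

    def __init__(self) -> None:
        self.entities: Dict[str, Dict[str, Any]] = {}

    def write(self, value: Any) -> Any:
        """Store entities found in value and return value with references in their place."""
        if isinstance(value, list):
            return [self.write(item) for item in value]
        if isinstance(value, dict):
            normalized = {key: self.write(item) for key, item in value.items()}
            if "__typename" in value and "id" in value:
                key = f'{value["__typename"]}:{value["id"]}'
                # Merge, so fields fetched by earlier queries are kept.
                self.entities.setdefault(key, {}).update(normalized)
                return {"__ref": key}
            return normalized
        return value

    def read(self, value: Any) -> Any:
        """Expand references back into the current entity data."""
        if isinstance(value, list):
            return [self.read(item) for item in value]
        if isinstance(value, dict):
            if "__ref" in value:
                return self.read(self.entities[value["__ref"]])
            return {key: self.read(item) for key, item in value.items()}
        return value


cache = NormalizedCache()
query_result = cache.write({
    "alert": {"__typename": "Alert", "id": "1", "prefix": "1.1.1.0/24", "events": ["HIJACK"]}
})
# A later mutation response mentions the same entity with changed fields.
cache.write({
    "updateAlert": {"alert": {"__typename": "Alert", "id": "1", "events": ["HIJACK", "WITHDRAWAL"]}}
})
print(cache.read(query_result))
```

The earlier query result now reflects the mutation without being refetched, because both results point at the single Alert:1 entry.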

Deferred execution (the @defer and @stream directives, currently in the GraphQL specification proposal stage) allows the server to send critical data immediately and stream less-important data as it becomes available. This is particularly useful for fields backed by slow data sources — the client renders available data immediately instead of waiting for the slowest resolver.

GraphQL and Network Infrastructure

From a networking perspective, GraphQL's single-endpoint model interacts with infrastructure differently than REST. Load balancers and CDN configurations designed for URL-based routing must adapt to content-based routing. A Vary-based caching strategy becomes complex when all requests hit the same URL with different POST bodies.

The BGP looking glass at god.ad uses a REST-style API for its lookup endpoints, which allows Cloudflare CDN to cache responses based on URL. If the API were GraphQL, each query would require server-side processing even for identical queries that had been answered moments before — unless the APQ mechanism or an edge-side GraphQL cache were deployed.

For observability, GraphQL's single endpoint means traditional HTTP access logs (which log URL, method, and status code) provide no visibility into what data clients are accessing. GraphQL servers need operation-name-aware logging, per-resolver tracing (compatible with OpenTelemetry), and field-level usage tracking to achieve the observability that REST gets from URL-based access logs for free.

Summary

GraphQL replaces REST's fixed-endpoint model with a typed query language that gives clients precise control over the data they receive. Its type system provides a machine-readable contract that powers introspection, tooling, and validation. The resolver execution model is intuitive but requires DataLoader to avoid N+1 performance problems. Persisted queries address the security concerns of accepting arbitrary query strings. Federation enables organizational scaling by distributing schema ownership across teams while maintaining a unified API surface.

The trade-offs are real: GraphQL adds complexity to the server, complicates HTTP caching, requires specialized tooling, and demands careful attention to query cost analysis and resolver performance. For APIs serving diverse clients with complex data relationships, these trade-offs pay off. For simple CRUD APIs or APIs where HTTP caching is the primary performance strategy, REST remains the pragmatic choice.

You can explore how GraphQL compares to gRPC and REST for different API patterns, see how gRPC handles service-to-service communication, or examine the HTTP/2 protocol that underlies many GraphQL transport implementations. To investigate real-world API infrastructure and the networks that connect them, try the god.ad BGP Looking Glass.