gRPC-Gateway and HTTP/JSON Transcoding
gRPC is built on HTTP/2 and Protocol Buffers, delivering compact binary serialization and bidirectional streaming between services. But the moment you need to expose that same API to a browser, a mobile app using plain HTTP, a partner integration expecting JSON, or a legacy system that only speaks REST, you hit a wall. Browsers cannot make native gRPC calls. Curl cannot send Protobuf. Your monitoring dashboard expects JSON. The solution is HTTP/JSON transcoding — a translation layer that sits in front of your gRPC services and maps RESTful HTTP requests to gRPC calls and back again, automatically and with minimal overhead.
Why You Need HTTP/JSON Alongside gRPC
gRPC's binary protocol is excellent for service-to-service communication inside a data center, but it creates friction at every boundary where the consumer cannot speak gRPC natively. There are several common scenarios where transcoding becomes necessary.
Browser Clients
The browser's fetch() API and XMLHttpRequest work over HTTP/1.1 and HTTP/2, but they cannot construct raw HTTP/2 frames with gRPC's framing protocol. gRPC-Web partially addresses this by defining a browser-compatible subset of gRPC, but it still requires a proxy and does not support all streaming modes. For many teams, a clean JSON/REST API that browsers can call directly is simpler than deploying a gRPC-Web proxy and generating client stubs.
Public APIs
When you publish an API for external developers, you cannot mandate that they install a Protobuf compiler, generate language-specific stubs, and learn the gRPC toolchain. REST/JSON is the lingua franca of public APIs — every programming language, every HTTP library, and every developer already knows how to call it. Google, for instance, exposes nearly all of its Cloud APIs through both gRPC and REST, using the same Protobuf service definitions to generate both interfaces.
Legacy Systems and Third-Party Integrations
Enterprise environments are full of systems that speak HTTP/JSON: webhook consumers, API gateways, monitoring tools, log aggregators, CI/CD pipelines, and SaaS integrations. Rewriting every integration point to use gRPC is rarely feasible. Transcoding lets you keep your internal services on gRPC while presenting a JSON interface to everything that needs one.
Tooling and Debugging
Developers can test a REST endpoint with curl, Postman, or a browser. Testing a gRPC endpoint requires grpcurl, Evans, or a custom client with compiled stubs. During development, incident response, and debugging, the ability to quickly fire off an HTTP request and read a JSON response is invaluable.
google.api.http Annotations in Proto Files
The foundation of gRPC transcoding is a set of annotations defined in Google's API design guide. You add google.api.http options to your Protobuf service methods to specify how each RPC maps to an HTTP endpoint. These annotations live in your .proto files, keeping the REST mapping co-located with the service definition.
Here is a concrete example. Suppose you have a service that manages network prefixes:
syntax = "proto3";

package network.v1;

import "google/api/annotations.proto";

service PrefixService {
  // Look up a single prefix by its CIDR block
  rpc GetPrefix(GetPrefixRequest) returns (Prefix) {
    option (google.api.http) = {
      get: "/v1/prefixes/{cidr}"
    };
  }

  // List all prefixes for an autonomous system
  rpc ListPrefixes(ListPrefixesRequest) returns (ListPrefixesResponse) {
    option (google.api.http) = {
      get: "/v1/asns/{asn}/prefixes"
    };
  }

  // Announce a new prefix
  rpc AnnouncePrefix(AnnouncePrefixRequest) returns (Prefix) {
    option (google.api.http) = {
      post: "/v1/prefixes"
      body: "*"
    };
  }

  // Update prefix attributes
  rpc UpdatePrefix(UpdatePrefixRequest) returns (Prefix) {
    option (google.api.http) = {
      patch: "/v1/prefixes/{prefix.cidr}"
      body: "prefix"
    };
  }
}
Each annotation specifies the HTTP method (get, post, put, patch, delete), a URL pattern with path parameters in curly braces, and optionally a body field that controls how the request body maps to the Protobuf message. The annotations are defined in google/api/annotations.proto and google/api/http.proto, which you import from the googleapis repository.
Path Parameters, Query Parameters, and Body Mapping
The transcoding layer applies precise rules to map between HTTP requests and Protobuf messages. Understanding these rules is essential for designing clean APIs.
Path Parameters
Curly-brace segments in the URL pattern bind to fields in the request message. If your URL is /v1/asns/{asn}/prefixes/{prefix_id} and your request message has fields asn (string) and prefix_id (string), an HTTP request to /v1/asns/AS15169/prefixes/8.8.8.0-24 will populate both fields automatically. Path parameters can reference nested fields using dot notation: {prefix.cidr} maps to the cidr field inside a nested prefix sub-message.
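The binding logic can be sketched in a few lines of Go. This is an illustrative simplification, not the grpc-gateway's actual matcher — the real path template grammar also supports wildcards and multi-segment captures:

```go
package main

import (
	"fmt"
	"strings"
)

// bindPathParams matches a concrete URL path against a template like
// "/v1/asns/{asn}/prefixes/{prefix_id}" and returns the bound variables.
// This mirrors, in simplified form, what a transcoder does before
// populating the Protobuf request message.
func bindPathParams(template, path string) (map[string]string, bool) {
	tSegs := strings.Split(strings.Trim(template, "/"), "/")
	pSegs := strings.Split(strings.Trim(path, "/"), "/")
	if len(tSegs) != len(pSegs) {
		return nil, false
	}
	params := map[string]string{}
	for i, seg := range tSegs {
		if strings.HasPrefix(seg, "{") && strings.HasSuffix(seg, "}") {
			// A {var} segment binds the concrete path segment to a field name.
			params[strings.Trim(seg, "{}")] = pSegs[i]
		} else if seg != pSegs[i] {
			// A literal segment must match exactly.
			return nil, false
		}
	}
	return params, true
}

func main() {
	params, ok := bindPathParams("/v1/asns/{asn}/prefixes/{prefix_id}",
		"/v1/asns/AS15169/prefixes/8.8.8.0-24")
	fmt.Println(ok, params["asn"], params["prefix_id"])
}
```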
Query Parameters
Any request message fields that are not bound by a path parameter and not consumed by the body mapping are automatically available as query parameters. For a ListPrefixes RPC with fields asn, page_size, and page_token, and a URL pattern of /v1/asns/{asn}/prefixes, the page_size and page_token fields become query parameters:
GET /v1/asns/AS13335/prefixes?page_size=50&page_token=abc123
This is automatic — you do not need to annotate each query parameter individually. Repeated fields (Protobuf arrays) map to repeated query parameters: ?status=active&status=pending.
Body Mapping
The body field in the annotation controls what portion of the request message is populated from the HTTP request body:
- body: "*" — The entire request message (minus path-bound fields) is mapped from the JSON body. This is the most common choice for POST and PUT methods.
- body: "resource" — Only the resource sub-field of the request message is populated from the body. Other fields come from path or query parameters. This is useful for update operations where the URL identifies the resource and the body carries the new values.
- No body field — No request body is expected. All fields must come from path and query parameters. This is the default for GET and DELETE.
On the response side, the entire response message is serialized to JSON by default. You can use response_body to select a sub-field, though this is uncommon.
The grpc-gateway: Go Reverse Proxy Generator
The grpc-gateway is the most widely used open-source tool for gRPC transcoding. It is a protoc plugin that reads your .proto files with google.api.http annotations and generates a Go reverse proxy server. This proxy accepts RESTful HTTP/JSON requests, translates them into gRPC calls, forwards them to your backend service, and translates the gRPC responses back to JSON.
How It Works
You run protoc with the --grpc-gateway_out plugin, and it generates a .gw.go file for each service. This file contains HTTP handler registration functions that you attach to the grpc-gateway's runtime.ServeMux, which you then mount on a standard net/http server like any other handler. At runtime, the generated code:
- Parses the incoming HTTP request (method, path, query string, body)
- Extracts path parameters and query parameters according to the annotations
- Deserializes the JSON body into the corresponding Protobuf request message
- Makes a gRPC call to the backend service
- Serializes the Protobuf response message to JSON
- Returns the JSON response with appropriate HTTP status codes
The generated proxy handles error mapping automatically: gRPC status codes are translated to HTTP status codes (NOT_FOUND becomes 404, INVALID_ARGUMENT becomes 400, INTERNAL becomes 500, and so on).
Deployment Topology
In production, the grpc-gateway proxy typically runs as a sidecar or a separate service in front of your gRPC servers. Internal service-to-service communication goes directly via gRPC, while external traffic flows through the gateway:
The grpc-gateway can also run in-process alongside your gRPC server. Instead of making network calls, the generated proxy invokes the gRPC service handler directly through Go interfaces, eliminating network round-trip overhead entirely. This is useful for small deployments where running a separate proxy process is unnecessary.
Envoy gRPC-JSON Transcoding Filter
If you are already running Envoy as your service mesh proxy or API gateway, you can use its built-in gRPC-JSON transcoding filter instead of deploying a separate grpc-gateway process. Envoy's transcoding filter reads the same google.api.http annotations from a compiled Protobuf descriptor file and performs the translation at the proxy layer.
The configuration involves two steps. First, you compile your proto files into a binary descriptor set:
protoc --descriptor_set_out=api_descriptor.pb \
--include_imports \
my_service.proto
Then you configure the Envoy filter to use this descriptor:
http_filters:
- name: envoy.filters.http.grpc_json_transcoder
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
    proto_descriptor: "/etc/envoy/api_descriptor.pb"
    services:
    - network.v1.PrefixService
    print_options:
      add_whitespace: true
      always_print_primitive_fields: true
      preserve_proto_field_names: true
Envoy handles the transcoding at the network layer with minimal overhead. Because Envoy is written in C++ and already sits in the data path for many microservice architectures (as the sidecar proxy in Istio, for example), adding transcoding requires no additional network hop. The filter operates within the same connection pipeline that handles TLS termination, load balancing, retries, and observability.
The tradeoff is flexibility. Envoy's transcoding filter does not support some edge cases that the grpc-gateway handles, such as custom error response shaping or middleware hooks. But for straightforward REST-to-gRPC mapping, Envoy's approach is operationally simpler — you do not need to build, deploy, or maintain a separate proxy application.
Cloud Endpoints and API Gateway Transcoding
Managed cloud services offer transcoding as a turnkey feature, removing the operational burden entirely.
Google Cloud Endpoints
Google Cloud Endpoints is built on the Extensible Service Proxy (ESP), which is itself built on Envoy. You upload your compiled proto descriptor and a service configuration file that references your google.api.http annotations. The ESP proxy is deployed alongside your gRPC service (as a sidecar container in Cloud Run or GKE), and it automatically transcodes HTTP/JSON to gRPC. Cloud Endpoints also adds API key validation, authentication, rate limiting, and monitoring — all configured declaratively.
AWS API Gateway
AWS API Gateway does not natively understand Protobuf or google.api.http annotations. To transcode with AWS, you typically deploy a grpc-gateway or a Lambda function that performs the translation, then place API Gateway in front for routing, authentication, and throttling. AWS's approach requires more custom glue code, but it integrates with the broader AWS ecosystem (IAM, CloudWatch, WAF).
Other Managed Gateways
Kong and Apigee both support gRPC proxying, with varying levels of transcoding capability. Kong's grpc-gateway plugin maps HTTP routes to gRPC methods using a proto descriptor, similar to Envoy. Apigee supports gRPC backends through its API proxy configuration, letting you define REST endpoints that forward to gRPC services.
Request and Response Transcoding in Detail
The transcoding process involves precise transformations at multiple layers. Understanding these details helps you design proto definitions that produce clean REST APIs.
JSON Field Name Mapping
Protobuf field names use snake_case by convention, but JSON APIs typically use camelCase. The transcoding layer handles this automatically: a Protobuf field named origin_as appears as originAs in JSON by default. You can control this with the json_name option in your proto definition, or by setting preserve_proto_field_names in the transcoder configuration to keep snake_case in JSON.
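The default name derivation is simple enough to show directly; a small Go function that mirrors the rule (drop each underscore and capitalize the letter that follows):

```go
package main

import (
	"fmt"
	"strings"
)

// toJSONName reproduces Protobuf's default JSON name derivation:
// origin_as becomes originAs. A proto author can override the result
// per field with the json_name option.
func toJSONName(protoName string) string {
	var b strings.Builder
	upperNext := false
	for _, r := range protoName {
		switch {
		case r == '_':
			upperNext = true // drop the underscore, capitalize what follows
		case upperNext:
			b.WriteString(strings.ToUpper(string(r)))
			upperNext = false
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(toJSONName("origin_as"))   // originAs
	fmt.Println(toJSONName("rpki_status")) // rpkiStatus
}
```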
Protobuf-JSON Type Mapping
Several Protobuf types have special JSON representations that the transcoder handles:
- int64, uint64, fixed64 — Serialized as JSON strings to avoid precision loss in JavaScript, which only supports 53-bit integers natively.
- bytes — Base64-encoded strings in JSON.
- google.protobuf.Timestamp — RFC 3339 date-time strings (e.g., "2025-01-15T08:30:00Z").
- google.protobuf.Duration — Seconds with nanosecond precision as a string (e.g., "3.5s").
- google.protobuf.Struct — Arbitrary JSON objects, useful for dynamic or untyped data.
- google.protobuf.FieldMask — Comma-separated field paths (e.g., "origin_as,rpki_status").
- Enums — Serialized as strings by default ("VALID" instead of 1), though numeric representation is also accepted.
HTTP Status Code Mapping
gRPC uses a fixed set of status codes that differ from HTTP status codes. The transcoder maps between them:
gRPC Status HTTP Status Meaning
------------------------------------------------------
OK 200 Success
INVALID_ARGUMENT 400 Bad request
UNAUTHENTICATED 401 Missing/invalid auth
PERMISSION_DENIED 403 Forbidden
NOT_FOUND 404 Resource not found
ALREADY_EXISTS 409 Conflict
FAILED_PRECONDITION 400 Precondition failed
RESOURCE_EXHAUSTED 429 Rate limited
CANCELLED 499 Client cancelled
INTERNAL 500 Internal server error
UNIMPLEMENTED 501 Not implemented
UNAVAILABLE 503 Service unavailable
DEADLINE_EXCEEDED 504 Gateway timeout
The transcoder also maps gRPC error detail messages to a JSON error response body, typically in the format {"code": 5, "message": "prefix not found", "details": [...]}. This follows Google's standard error model, but you can customize the error shape in the grpc-gateway through interceptors.
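The mapping in the table is simple enough to express as a lookup; a dependency-free Go sketch (real gateways key off google.golang.org/grpc/codes values rather than strings):

```go
package main

import "fmt"

// httpStatusFor mirrors the gRPC-to-HTTP status table above.
func httpStatusFor(grpcCode string) int {
	m := map[string]int{
		"OK":                  200,
		"INVALID_ARGUMENT":    400,
		"FAILED_PRECONDITION": 400,
		"UNAUTHENTICATED":     401,
		"PERMISSION_DENIED":   403,
		"NOT_FOUND":           404,
		"ALREADY_EXISTS":      409,
		"RESOURCE_EXHAUSTED":  429,
		"CANCELLED":           499,
		"INTERNAL":            500,
		"UNIMPLEMENTED":       501,
		"UNAVAILABLE":         503,
		"DEADLINE_EXCEEDED":   504,
	}
	if s, ok := m[grpcCode]; ok {
		return s
	}
	return 500 // unknown codes fall back to a generic server error
}

func main() {
	fmt.Println(httpStatusFor("NOT_FOUND"))          // 404
	fmt.Println(httpStatusFor("RESOURCE_EXHAUSTED")) // 429
}
```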
Additional Bindings
A single RPC method can be exposed at multiple HTTP endpoints using the additional_bindings option. This is useful for API versioning or when you want multiple URL patterns to invoke the same backend method:
rpc GetPrefix(GetPrefixRequest) returns (Prefix) {
  option (google.api.http) = {
    get: "/v1/prefixes/{cidr}"
    additional_bindings {
      get: "/v1/routes/{cidr}/prefix"
    }
  };
}
Streaming Transcoding Limitations
gRPC supports four communication patterns: unary, server streaming, client streaming, and bidirectional streaming. Transcoding works well for some of these, but not all.
Unary RPCs
A single request and single response — this maps cleanly to standard HTTP request/response semantics. Every transcoding solution handles unary RPCs without issues.
Server Streaming
The server sends multiple response messages for a single request. The grpc-gateway represents this as newline-delimited JSON (NDJSON), where each line is a complete JSON object representing one response message. Envoy uses HTTP chunked transfer encoding to stream responses. This works for use cases like watching BGP route updates in real time or tailing a log stream, but clients must be written to handle streaming responses rather than expecting a single JSON array.
// Server streaming over HTTP:
GET /v1/routes/watch?prefix=8.8.8.0/24
// Response (NDJSON - one JSON object per line):
{"prefix":"8.8.8.0/24","origin_as":"AS15169","event":"UPDATE"}
{"prefix":"8.8.8.0/24","origin_as":"AS15169","event":"WITHDRAW"}
{"prefix":"8.8.8.0/24","origin_as":"AS15169","event":"UPDATE"}
Client Streaming
The client sends multiple request messages and receives a single response. This has no natural HTTP/1.1 equivalent. You cannot send multiple JSON bodies in a single HTTP request without a framing protocol. The grpc-gateway does not support client streaming transcoding. You must either batch the messages into a single request (by redesigning the API with a repeated field) or require clients to use native gRPC for this method.
Bidirectional Streaming
Both client and server stream messages simultaneously. This is fundamentally incompatible with HTTP/1.1 request-response semantics. Neither the grpc-gateway nor Envoy's transcoding filter supports bidirectional streaming over REST. WebSockets could bridge the gap in theory, but none of the standard transcoding tools implement this. For bidirectional streaming, clients must use native gRPC or gRPC-Web.
This limitation is fundamental to the REST model, not a deficiency in the tools. HTTP request-response semantics are inherently half-duplex at the application layer. If your service relies heavily on client or bidirectional streaming, transcoding is not the right approach for those specific methods — expose them through native gRPC or gRPC-Web instead, while transcoding the unary and server-streaming methods for REST consumers.
OpenAPI/Swagger Generation from Proto Files
One of the most valuable side effects of annotating your proto files with google.api.http is the ability to automatically generate an OpenAPI (Swagger) specification for your REST API. The grpc-gateway project includes a companion protoc plugin, protoc-gen-openapiv2, that reads your annotated protos and produces an OpenAPI 2.0 (Swagger) JSON file.
protoc --openapiv2_out=./docs \
--openapiv2_opt=logtostderr=true \
--openapiv2_opt=json_names_for_fields=true \
my_service.proto
The generated specification includes all the HTTP endpoints, request/response schemas derived from Protobuf message definitions, path and query parameters, and even comments from your proto files as descriptions. You can feed this directly into Swagger UI, Redoc, or any other API documentation tool to produce interactive documentation.
This means your Protobuf service definitions become the single source of truth for:
- gRPC client stubs (generated by protoc)
- REST reverse proxy code (generated by protoc-gen-grpc-gateway)
- OpenAPI documentation (generated by protoc-gen-openapiv2)
- Client SDKs for REST consumers (generated from OpenAPI by tools like openapi-generator)
The Buf ecosystem offers a more modern alternative. The buf generate command can run all three plugins in a single invocation, managed by a buf.gen.yaml configuration file. Buf also handles dependency management for the googleapis protos, eliminating the manual setup of importing google/api/annotations.proto.
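As a sketch, a buf.gen.yaml for this setup might look like the following; plugin names, versions, and output paths vary by project, so treat this as illustrative rather than a drop-in config:

```yaml
version: v1
plugins:
  - plugin: go            # gRPC message types
    out: gen/go
    opt: paths=source_relative
  - plugin: go-grpc       # gRPC service stubs
    out: gen/go
    opt: paths=source_relative
  - plugin: grpc-gateway  # REST reverse proxy
    out: gen/go
    opt: paths=source_relative
  - plugin: openapiv2     # Swagger/OpenAPI 2.0 documentation
    out: docs
```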
OpenAPI v3 Support
The grpc-gateway's OpenAPI plugin currently generates OpenAPI 2.0 (Swagger). For OpenAPI 3.0, you can use the gnostic converter to transform the 2.0 spec to 3.0, or use third-party protoc plugins like protoc-gen-openapi (from the Google gnostic project) that produce OpenAPI 3.0 directly. Google's own API tooling (used for Cloud APIs) generates OpenAPI 3.0 from annotated proto files, confirming that the annotation format supports both specification versions.
Performance Overhead of Transcoding
Transcoding introduces overhead at several points. Understanding where the cost lies helps you decide whether it matters for your use case.
Serialization Cost
The dominant overhead is the JSON-to-Protobuf and Protobuf-to-JSON conversion. Protobuf binary encoding is compact and fast to parse. JSON is text-based, verbose, and requires more CPU to parse and generate. For small messages (under a few kilobytes), the difference is negligible — microseconds at most. For large messages (megabytes of data, deeply nested structures, large repeated fields), JSON serialization can consume meaningful CPU time.
Benchmarks consistently show that Protobuf serialization is 3-10x faster than JSON, and the encoded size is 2-5x smaller. But in a transcoding scenario, you pay the JSON cost only on the external-facing edge. Internal service-to-service calls still use native Protobuf, so the performance impact is limited to the boundary.
Network Overhead
If the transcoding proxy runs as a separate process, there is an additional network hop between the proxy and the gRPC backend. This adds latency (typically sub-millisecond on a local network, a few milliseconds cross-zone in a cloud environment) and consumes bandwidth for the internal gRPC call. Running the proxy in-process (grpc-gateway's in-process mode) or as an Envoy sidecar (on the same pod/VM) minimizes this overhead.
Memory and CPU
The transcoding proxy must buffer the full JSON request body before it can deserialize it into a Protobuf message (JSON is not streamable in the same way Protobuf is, because field ordering and nesting require seeing the complete object). For very large request bodies, this means the proxy's memory usage scales with request size. gRPC's native streaming avoids this issue by processing messages incrementally.
Practical Impact
For the vast majority of APIs, the transcoding overhead is not the bottleneck. Database queries, network calls to downstream services, and business logic dominate latency. The transcoding layer typically adds 0.5-2ms of latency for unary calls with reasonably sized payloads. At Google scale, where Cloud APIs serve millions of requests per second through transcoding, the architecture has proven viable for production workloads.
If you are optimizing for absolute minimum latency on the external API, consider:
- Running the transcoding proxy in-process or as a sidecar to eliminate the network hop
- Using Envoy's C++ transcoding filter rather than the Go-based grpc-gateway for lower per-request CPU cost
- Enabling HTTP/2 between the client and the transcoding proxy to avoid head-of-line blocking
- Caching frequently requested resources at the CDN or proxy layer to bypass transcoding entirely
Designing Proto Files for Good REST APIs
Not every Protobuf service definition produces a clean REST API through transcoding. Following a few design principles ensures the generated REST endpoints feel natural to consumers who have never seen your proto files.
Use Resource-Oriented Design
Structure your services around resources (nouns) rather than actions (verbs). Instead of rpc ActivatePrefix(ActivatePrefixRequest) mapped to POST /v1/activatePrefix, model it as rpc UpdatePrefix(UpdatePrefixRequest) mapped to PATCH /v1/prefixes/{id} with an active field. This follows the standard REST conventions that API consumers expect.
Follow the AIP (API Improvement Proposals)
Google's AIP guidelines (aip.dev) codify best practices for designing APIs that work well with both gRPC and REST. Key AIPs include standard methods (AIP-131 through AIP-135 for Get, List, Create, Update, Delete), field masks for partial updates (AIP-134), filtering and pagination (AIP-132 and AIP-158), and error handling (AIP-193). Following these patterns produces consistent, predictable REST endpoints.
Keep Messages Flat for Simple Endpoints
Deeply nested Protobuf messages create complex JSON structures. Where possible, flatten request messages so that the HTTP mapping is straightforward. A request with top-level fields maps cleanly to path parameters, query parameters, and a simple JSON body. A request that requires navigating three levels of nesting to set a field creates friction for REST consumers.
End-to-End Workflow
Putting it all together, here is the typical workflow for adding gRPC transcoding to a service:
- Define your service in .proto files with google.api.http annotations on each RPC method.
- Generate gRPC server stubs in your language of choice using protoc.
- Implement the gRPC service with your business logic.
- Generate the transcoding layer — either grpc-gateway Go code, an Envoy descriptor, or a cloud service configuration.
- Generate OpenAPI documentation using protoc-gen-openapiv2.
- Deploy the gRPC service and the transcoding proxy together.
- Test the REST API using curl or Postman, and the gRPC API using grpcurl or generated client stubs.
The key insight is that you write your service logic once, in your gRPC implementation. The REST API is entirely derived — no separate REST controllers, no manual JSON parsing, no duplicate validation logic. When you add a field to your Protobuf message, it automatically appears in both the gRPC and REST interfaces. When you add a new RPC method with HTTP annotations, a new REST endpoint appears.
When Not to Transcode
Transcoding is not always the right choice. If your API is exclusively consumed by services you control and all of them can speak gRPC, transcoding adds complexity without benefit. If your API relies heavily on client streaming or bidirectional streaming, the transcoded REST surface will be incomplete, creating a confusing experience where some methods work over REST and others do not. If you need to support REST-specific patterns that do not map to gRPC semantics — like content negotiation, HATEOAS links, or multipart file uploads — a custom REST layer may be more appropriate than transcoding.
For everything else — public APIs that need broad accessibility, internal services with mixed client ecosystems, gradual migrations from REST to gRPC, or teams that want the type safety of Protobuf with the convenience of JSON — transcoding is one of the most effective patterns in the gRPC ecosystem.
Further Reading
- How gRPC Works — understand the protocol that transcoding translates from
- gRPC vs REST — compare the two API styles and when to use each
- How gRPC-Web Works — an alternative approach for browser clients
- How Protocol Buffers Work — the serialization format underlying gRPC