gRPC Interceptors and Middleware Patterns

gRPC interceptors are the middleware layer of the gRPC framework. They sit between the client or server application code and the underlying gRPC transport, letting you intercept every RPC call to inject cross-cutting concerns like logging, authentication, metrics collection, rate limiting, and error recovery. If you have built gRPC services, you have almost certainly needed interceptors — they are the canonical way to avoid scattering infrastructure logic across every handler.

Unlike HTTP middleware, which operates on generic request/response pairs, gRPC interceptors are typed to the RPC model. They distinguish between unary calls (single request, single response) and streaming calls (one or both sides sending a sequence of messages). This distinction matters because the interception points and the data available at each point differ fundamentally between the two modes.

Unary Interceptors vs Stream Interceptors

gRPC defines two interceptor categories that mirror its two communication patterns. Understanding when each fires and what it can access is essential to writing correct middleware.

Unary Interceptors

A unary interceptor wraps a single request-response exchange. The interceptor receives the full request, can inspect or modify it, invokes the next handler (or the actual RPC implementation), and then receives the full response before it is sent back. This is the simpler model and covers the majority of gRPC use cases.

In Go, a server-side unary interceptor has this signature:

func myInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    // Pre-processing: inspect ctx, req, info.FullMethod
    start := time.Now()

    // Call the actual handler
    resp, err := handler(ctx, req)

    // Post-processing: log, record metrics, transform errors
    log.Printf("method=%s duration=%v err=%v", info.FullMethod, time.Since(start), err)
    return resp, err
}

The key property of unary interceptors is that both the request and response are fully available as concrete objects. You can deserialize, validate, mutate, or replace them entirely. The info.FullMethod string (e.g., /mypackage.MyService/MyMethod) tells you exactly which RPC is being invoked, enabling method-specific logic.
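Method-specific logic usually starts by parsing info.FullMethod. A stdlib-only sketch (the splitFullMethod helper is ours, not part of the grpc package):

```go
package main

import (
	"fmt"
	"strings"
)

// splitFullMethod parses a full method name of the form
// "/package.Service/Method" into its service and method parts.
// Hypothetical helper; not part of the grpc package.
func splitFullMethod(fullMethod string) (service, method string) {
	trimmed := strings.TrimPrefix(fullMethod, "/")
	if i := strings.LastIndex(trimmed, "/"); i >= 0 {
		return trimmed[:i], trimmed[i+1:]
	}
	return trimmed, ""
}

func main() {
	svc, m := splitFullMethod("/mypackage.MyService/MyMethod")
	fmt.Println(svc, m) // mypackage.MyService MyMethod
}
```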

Stream Interceptors

Stream interceptors handle server-streaming, client-streaming, and bidirectional-streaming RPCs. Instead of receiving a complete request and returning a complete response, a stream interceptor receives a ServerStream (or ClientStream) object that wraps the underlying stream. The interceptor can then wrap this stream object to intercept individual messages as they flow through.

func myStreamInterceptor(
    srv interface{},
    ss grpc.ServerStream,
    info *grpc.StreamServerInfo,
    handler grpc.StreamHandler,
) error {
    // Wrap the stream to intercept messages
    wrapped := &wrappedStream{ServerStream: ss}

    // Call the handler with the wrapped stream
    err := handler(srv, wrapped)
    return err
}

type wrappedStream struct {
    grpc.ServerStream
}

func (w *wrappedStream) RecvMsg(m interface{}) error {
    // Intercept each incoming message
    err := w.ServerStream.RecvMsg(m)
    log.Printf("received message: %v", m)
    return err
}

func (w *wrappedStream) SendMsg(m interface{}) error {
    // Intercept each outgoing message
    log.Printf("sending message: %v", m)
    return w.ServerStream.SendMsg(m)
}

Stream interceptors are more complex because the interception surface is broader. You need to handle SendMsg, RecvMsg, SendHeader, and SetTrailer calls, each of which can carry metadata. A single stream may exchange hundreds of messages, so performance of your interceptor logic matters more here than in the unary case.
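The per-message cost is easy to see with a message-counting wrapper. This sketch defines a minimal local stand-in for grpc.ServerStream (just RecvMsg/SendMsg) so it runs without the grpc module; fakeStream is an illustrative no-op peer:

```go
package main

import "fmt"

// serverStream is a minimal local stand-in for grpc.ServerStream,
// enough to show the embedding pattern: the wrapper overrides only
// the methods it cares about.
type serverStream interface {
	RecvMsg(m interface{}) error
	SendMsg(m interface{}) error
}

type countingStream struct {
	serverStream
	recvd, sent int
}

func (c *countingStream) RecvMsg(m interface{}) error {
	c.recvd++ // runs once per incoming message
	return c.serverStream.RecvMsg(m)
}

func (c *countingStream) SendMsg(m interface{}) error {
	c.sent++ // runs once per outgoing message
	return c.serverStream.SendMsg(m)
}

// fakeStream is an illustrative no-op peer for the example.
type fakeStream struct{}

func (fakeStream) RecvMsg(m interface{}) error { return nil }
func (fakeStream) SendMsg(m interface{}) error { return nil }

func main() {
	s := &countingStream{serverStream: fakeStream{}}
	for i := 0; i < 3; i++ {
		s.RecvMsg(nil)
	}
	s.SendMsg(nil)
	fmt.Println(s.recvd, s.sent) // 3 1
}
```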

[Figure: Unary vs stream interceptor message flow. Unary: the client sends one request, the interceptor wraps Interceptor(req) -> resp around the handler, and the client receives one response. Stream: the client sends messages 1, 2, 3 ...; the wrapped stream's RecvMsg/SendMsg intercept each message as the handler reads and writes the stream.]

Server-Side vs Client-Side Interceptors

gRPC interceptors exist on both sides of the connection. They serve different purposes and have access to different information.

Server-Side Interceptors

Server interceptors run on every incoming RPC before (and after) the handler executes. They are the natural place for authentication and authorization, request logging, metrics collection, rate limiting, request validation, and panic recovery.

Server interceptors are registered when creating the gRPC server. In Go:

server := grpc.NewServer(
    grpc.UnaryInterceptor(myUnaryInterceptor),
    grpc.StreamInterceptor(myStreamInterceptor),
)

Client-Side Interceptors

Client interceptors wrap outgoing RPCs. They are useful for attaching credentials to every call, implementing retries and hedging, recording client-side metrics and traces, and applying default deadlines.

In Go, client interceptors are specified via DialOption:

conn, err := grpc.Dial(
    target,
    grpc.WithUnaryInterceptor(clientUnaryInterceptor),
    grpc.WithStreamInterceptor(clientStreamInterceptor),
)

The client unary interceptor signature mirrors the server side but includes the grpc.CallOption slice and the invoker function:

func clientUnaryInterceptor(
    ctx context.Context,
    method string,
    req, reply interface{},
    cc *grpc.ClientConn,
    invoker grpc.UnaryInvoker,
    opts ...grpc.CallOption,
) error {
    // Add auth metadata
    ctx = metadata.AppendToOutgoingContext(ctx, "authorization", "Bearer "+token)

    // Invoke the RPC
    err := invoker(ctx, method, req, reply, cc, opts...)
    return err
}

Interceptor Chaining and Execution Order

Real applications need multiple interceptors: logging, auth, metrics, validation, recovery. These must be composed into a chain, and the order matters. A logging interceptor should wrap everything so it captures the total duration, while an auth interceptor should run early so unauthorized requests are rejected before wasting compute on validation or business logic.

Vanilla gRPC-Go originally allowed only a single interceptor per type, and the grpc-ecosystem/go-grpc-middleware library provided ChainUnaryServer and ChainStreamServer to compose several. Modern gRPC-Go (v1.28 and later) supports native chaining via grpc.ChainUnaryInterceptor and grpc.ChainStreamInterceptor.

server := grpc.NewServer(
    grpc.ChainUnaryInterceptor(
        recoveryInterceptor,    // outermost: catch panics
        loggingInterceptor,     // log every RPC
        metricsInterceptor,     // record Prometheus metrics
        authInterceptor,        // validate tokens
        validationInterceptor,  // validate request payloads
    ),
)

Interceptors execute in order for the pre-processing phase (before calling the handler) and in reverse order for the post-processing phase (after the handler returns). This creates a nested "onion" structure:

[Figure: Interceptor chain execution order (onion model). Recovery wraps Logging wraps Metrics wraps Auth wraps Validation wraps the RPC handler; a request enters the outermost layer first and the response exits the outermost layer last.]

In this model, if the auth interceptor rejects a request, the validation interceptor and the handler never execute. But the logging and metrics interceptors still see the result (the auth error) because they wrap the entire inner chain. This is exactly the behavior you want: metrics capture all RPCs including rejected ones, and logs capture the full picture.

Common Interceptor Patterns

Logging

A logging interceptor records the method name, duration, status code, and optionally the request/response payloads for every RPC. It should be one of the outermost interceptors so it captures the full duration including time spent in other interceptors.

func loggingInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    start := time.Now()
    resp, err := handler(ctx, req)

    code := status.Code(err)
    peerAddr := "unknown"
    if p, ok := peer.FromContext(ctx); ok { // google.golang.org/grpc/peer
        peerAddr = p.Addr.String()
    }
    log.Printf("grpc method=%s code=%s duration=%v peer=%s",
        info.FullMethod,
        code.String(),
        time.Since(start),
        peerAddr,
    )
    return resp, err
}

For production systems, structured logging (JSON) with fields for trace ID, user identity, and request size is more useful than plain text. The grpc-ecosystem/go-grpc-middleware/logging package provides ready-made integrations with zap, logrus, and zerolog.

Metrics and Prometheus

Metrics interceptors record counters and histograms for every RPC. The standard set of metrics follows the RED method (Rate, Errors, Duration):

var (
    rpcCounter = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "grpc_server_handled_total",
            Help: "Total number of RPCs completed",
        },
        []string{"grpc_method", "grpc_code"},
    )
    rpcDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "grpc_server_handling_seconds",
            Help:    "Histogram of RPC handling durations",
            Buckets: prometheus.DefBuckets,
        },
        []string{"grpc_method"},
    )
)

func metricsInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    start := time.Now()
    resp, err := handler(ctx, req)
    code := status.Code(err)
    rpcCounter.WithLabelValues(info.FullMethod, code.String()).Inc()
    rpcDuration.WithLabelValues(info.FullMethod).Observe(time.Since(start).Seconds())
    return resp, err
}

The grpc-ecosystem/go-grpc-prometheus package provides a production-ready implementation that covers both unary and streaming RPCs with all the standard labels. For stream interceptors, it counts individual messages sent and received per stream.

Authentication

Authentication interceptors extract credentials from gRPC metadata (the equivalent of HTTP headers), validate them, and inject the authenticated identity into the context for downstream handlers. The standard pattern uses the authorization metadata key with a Bearer token:

func authInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    // Skip auth for health checks
    if info.FullMethod == "/grpc.health.v1.Health/Check" {
        return handler(ctx, req)
    }

    md, ok := metadata.FromIncomingContext(ctx)
    if !ok {
        return nil, status.Error(codes.Unauthenticated, "missing metadata")
    }

    tokens := md.Get("authorization")
    if len(tokens) == 0 {
        return nil, status.Error(codes.Unauthenticated, "missing auth token")
    }

    userID, err := validateToken(tokens[0])
    if err != nil {
        return nil, status.Error(codes.Unauthenticated, "invalid token")
    }

    // Attach user identity to context
    ctx = context.WithValue(ctx, userIDKey, userID)
    return handler(ctx, req)
}

For mTLS-based authentication, the interceptor extracts the client certificate from the TLS peer information in the context rather than reading metadata. gRPC also supports per-RPC credentials via the credentials.PerRPCCredentials interface, which automatically attaches metadata on every call from the client side.

Rate Limiting

Rate limiting interceptors prevent clients from overwhelming the server. They can operate at multiple granularities: per-client (identified by auth token or peer IP), per-method, or globally. A token bucket or sliding window algorithm works well here:

func rateLimitInterceptor(limiter *rate.Limiter) grpc.UnaryServerInterceptor {
    return func(
        ctx context.Context,
        req interface{},
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (interface{}, error) {
        if !limiter.Allow() {
            return nil, status.Error(codes.ResourceExhausted, "rate limit exceeded")
        }
        return handler(ctx, req)
    }
}

The correct gRPC status code for rate limiting is ResourceExhausted (code 8). Well-behaved clients will back off and retry when they receive this code. You can also include retry-after information in the response trailing metadata to give clients an explicit backoff duration.

Validation

Validation interceptors check that incoming requests meet structural or business constraints before the handler runs. Rather than scattering validation logic across every RPC handler, a single interceptor can enforce it uniformly. The most common approach uses the protoc-gen-validate (PGV) plugin or the newer protovalidate library, which lets you define validation rules directly in your .proto files:

// In your .proto file:
message CreateUserRequest {
    string email = 1 [(buf.validate.field).string.email = true];
    string name = 2 [(buf.validate.field).string = {min_len: 1, max_len: 100}];
    int32 age = 3 [(buf.validate.field).int32 = {gte: 0, lte: 150}];
}

func validationInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    if v, ok := req.(interface{ Validate() error }); ok {
        if err := v.Validate(); err != nil {
            return nil, status.Error(codes.InvalidArgument, err.Error())
        }
    }
    return handler(ctx, req)
}

This pattern has an important advantage: the validation rules are defined in the proto schema, shared across all languages, and enforced automatically by the interceptor without any per-method code.
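To make the interface assertion concrete, here is a hand-written request type implementing Validate() error, standing in for what protoc-gen-validate would generate (the struct and rules are illustrative, not generated code):

```go
package main

import (
	"errors"
	"fmt"
)

// CreateUserRequest is a hand-written stand-in for a generated
// message type whose codegen adds a Validate() error method.
type CreateUserRequest struct {
	Email string
	Age   int
}

func (r *CreateUserRequest) Validate() error {
	if r.Email == "" {
		return errors.New("email: must not be empty")
	}
	if r.Age < 0 || r.Age > 150 {
		return errors.New("age: must be between 0 and 150")
	}
	return nil
}

// validate mirrors the interceptor's type assertion on an untyped request.
func validate(req interface{}) error {
	if v, ok := req.(interface{ Validate() error }); ok {
		return v.Validate()
	}
	return nil // requests without validation rules pass through
}

func main() {
	fmt.Println(validate(&CreateUserRequest{Email: "a@b.com", Age: 30})) // <nil>
	fmt.Println(validate(&CreateUserRequest{Age: 200}))                  // email: must not be empty
}
```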

Recovery / Panic Handling

In Go (and similar languages), an unhandled panic in an RPC handler will crash the entire server process. A recovery interceptor catches these panics and converts them to Internal gRPC errors, keeping the server running:

func recoveryInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (resp interface{}, err error) {
    defer func() {
        if r := recover(); r != nil {
            log.Printf("panic recovered in %s: %v\n%s",
                info.FullMethod, r, debug.Stack())
            err = status.Errorf(codes.Internal, "internal server error")
        }
    }()
    return handler(ctx, req)
}

This interceptor should always be the outermost in the chain. If a panic occurs in any inner interceptor or in the handler itself, the recovery interceptor catches it. The grpc-ecosystem/go-grpc-middleware/recovery package provides a battle-tested version with customizable recovery functions.

Metadata Propagation

gRPC metadata is the mechanism for passing key-value pairs alongside RPCs, analogous to HTTP headers. Interceptors are the primary consumers and producers of metadata. There are two types: headers, sent before the first message, and trailers, sent after the last message -- the only place a server can attach data once the final status is known.

Metadata keys are case-insensitive strings. Keys ending in -bin have binary values (base64-encoded on the wire). Common metadata keys include authorization (bearer tokens), grpc-timeout (deadline propagation), user-agent, and tracing keys such as x-request-id.

In a microservices architecture, you often need to propagate metadata from incoming requests to outgoing calls. This is how distributed trace context flows through a system. An interceptor can automate this propagation:

func propagationInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    md, _ := metadata.FromIncomingContext(ctx)

    // Extract trace headers and store in context for outgoing calls
    if traceIDs := md.Get("x-trace-id"); len(traceIDs) > 0 {
        ctx = metadata.AppendToOutgoingContext(ctx, "x-trace-id", traceIDs[0])
    }
    if spanIDs := md.Get("x-span-id"); len(spanIDs) > 0 {
        newSpanID := generateSpanID()
        ctx = metadata.AppendToOutgoingContext(ctx,
            "x-parent-span-id", spanIDs[0],
            "x-span-id", newSpanID,
        )
    }

    return handler(ctx, req)
}

OpenTelemetry provides official gRPC interceptors that handle trace context propagation automatically using the W3C Trace Context standard. In production, prefer these over hand-rolled propagation.

Context Values and Deadlines

The context.Context passed through gRPC interceptors carries two critical pieces of information: values (arbitrary key-value data attached by interceptors) and deadlines (when the RPC must complete by).

Context Values

Interceptors commonly attach data to the context so that downstream handlers and other interceptors can access it. The auth interceptor example above attaches a user ID. Other examples include attaching a logger with pre-populated fields, a database transaction, or feature flags.

// Auth interceptor sets this
ctx = context.WithValue(ctx, userIDKey, userID)

// Handler retrieves it
userID := ctx.Value(userIDKey).(string)

Context values should be used judiciously. They are untyped and invisible in function signatures, which makes code harder to reason about. In general, use them for cross-cutting concerns (auth identity, trace context, request-scoped loggers) rather than business data.

Deadlines

gRPC has built-in deadline propagation. When a client sets a deadline (or timeout), it is transmitted to the server in the grpc-timeout metadata header. The server's context automatically has this deadline set. If the client specifies a 5-second timeout, the server's ctx.Deadline() returns the corresponding wall-clock time.

Interceptors can enforce or modify deadlines:

func deadlineInterceptor(maxDuration time.Duration) grpc.UnaryServerInterceptor {
    return func(
        ctx context.Context,
        req interface{},
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (interface{}, error) {
        // If no deadline set, or deadline is too far out, cap it
        deadline, ok := ctx.Deadline()
        if !ok || time.Until(deadline) > maxDuration {
            var cancel context.CancelFunc
            ctx, cancel = context.WithTimeout(ctx, maxDuration)
            defer cancel()
        }
        return handler(ctx, req)
    }
}

When a server makes outgoing gRPC calls to other services, the remaining deadline is automatically propagated. If the client gave 5 seconds, the server spent 1 second on processing, the outgoing call gets a context with roughly 4 seconds remaining. This cascading deadline propagation prevents runaway fan-out from consuming resources indefinitely.

[Figure: Deadline propagation across services on a 0-5s timeline. The client sets a 5s timeout; Service A sees ~4s remaining, Service B ~3.2s, Service C ~2.4s.]

Interceptors in Go (grpc-middleware)

The Go gRPC ecosystem has the most mature interceptor tooling. The grpc-ecosystem/go-grpc-middleware v2 library provides a comprehensive set of production-ready interceptors: logging (with adapters for zap, logrus, zerolog, and slog), auth, recovery, retry, ratelimit, timeout, validator, and selector.

A production server in Go commonly looks like this:

import (
    "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/auth"
    "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/logging"
    "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/recovery"
    "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/selector"
    "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
)

server := grpc.NewServer(
    grpc.StatsHandler(otelgrpc.NewServerHandler()),  // OpenTelemetry tracing
    grpc.ChainUnaryInterceptor(
        recovery.UnaryServerInterceptor(),
        logging.UnaryServerInterceptor(logger),
        auth.UnaryServerInterceptor(authFunc),
        selector.UnaryServerInterceptor(
            auth.UnaryServerInterceptor(adminAuthFunc),
            selector.MatchFunc(isAdminMethod),
        ),
    ),
    grpc.ChainStreamInterceptor(
        recovery.StreamServerInterceptor(),
        logging.StreamServerInterceptor(logger),
        auth.StreamServerInterceptor(authFunc),
    ),
)

The selector package enables conditional interceptor application. In this example, the admin auth interceptor only runs for methods matching isAdminMethod. This avoids applying expensive or restrictive interceptors to methods that do not need them (like health checks or public endpoints).

Interceptors in Java (ServerInterceptor)

Java gRPC uses the ServerInterceptor interface with a single method: interceptCall. Unlike Go, Java does not split into unary vs stream interceptors -- a single interface handles both. The interceptor wraps the ServerCallHandler and returns a ServerCall.Listener that can intercept incoming messages, headers, and completion events.

public class LoggingInterceptor implements ServerInterceptor {
    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call,
            Metadata headers,
            ServerCallHandler<ReqT, RespT> next) {

        String method = call.getMethodDescriptor().getFullMethodName();
        long start = System.nanoTime();

        // Wrap the call to intercept responses
        ServerCall<ReqT, RespT> wrappedCall = new ForwardingServerCall
                .SimpleForwardingServerCall<ReqT, RespT>(call) {
            @Override
            public void close(Status status, Metadata trailers) {
                long duration = System.nanoTime() - start;
                logger.info("method={} status={} duration={}ms",
                    method, status.getCode(), duration / 1_000_000);
                super.close(status, trailers);
            }
        };

        // Wrap the listener to intercept requests
        ServerCall.Listener<ReqT> listener = next.startCall(wrappedCall, headers);
        return new ForwardingServerCallListener
                .SimpleForwardingServerCallListener<ReqT>(listener) {
            @Override
            public void onMessage(ReqT message) {
                logger.debug("received: {}", message);
                super.onMessage(message);
            }
        };
    }
}

Java interceptors are registered on the server builder and execute in reverse registration order (the last registered interceptor runs first, outermost):

Server server = ServerBuilder.forPort(8080)
    .addService(myService)
    .intercept(new LoggingInterceptor())     // runs second (inner)
    .intercept(new AuthInterceptor())        // runs first (outer)
    .build();

For context propagation, Java gRPC uses the Context class (from the io.grpc package, not java.util.concurrent). Keys are defined statically and values are attached via Context.current().withValue(key, value). The Contexts.interceptCall utility method handles attaching the new context to the call correctly.

Interceptors in Python

Python gRPC supports interceptors on both client and server sides. Server interceptors implement the grpc.ServerInterceptor class and override the intercept_service method. The API is less ergonomic than Go or Java -- the handler continuation returns an RpcMethodHandler whose per-arity handlers (unary_unary, unary_stream, and so on) you need to unwrap and rewrap:

class LoggingInterceptor(grpc.ServerInterceptor):
    def intercept_service(self, continuation, handler_call_details):
        method = handler_call_details.method
        start = time.time()

        # Get the actual handler
        handler = continuation(handler_call_details)
        if handler is None:
            return None

        # Wrap unary-unary handlers
        if handler.unary_unary:
            original = handler.unary_unary
            def wrapped(request, context):
                try:
                    response = original(request, context)
                    return response
                finally:
                    duration = time.time() - start
                    logging.info(f"method={method} duration={duration:.3f}s")

            return grpc.unary_unary_rpc_method_handler(
                wrapped,
                request_deserializer=handler.request_deserializer,
                response_serializer=handler.response_serializer,
            )

        return handler

Client-side interceptors in Python are more straightforward. You implement grpc.UnaryUnaryClientInterceptor (and similar for streaming variants) and pass them to grpc.intercept_channel:

class RetryInterceptor(grpc.UnaryUnaryClientInterceptor):
    def intercept_unary_unary(self, continuation, client_call_details, request):
        for attempt in range(3):
            response = continuation(client_call_details, request)
            # code() blocks until the call completes without raising;
            # calling result() here would raise on error responses.
            if response.code() not in (grpc.StatusCode.UNAVAILABLE,):
                return response
            time.sleep(0.1 * (2 ** attempt))
        return response

channel = grpc.intercept_channel(
    grpc.insecure_channel('localhost:50051'),
    RetryInterceptor(),
)

Python's async gRPC (grpcio with asyncio) supports interceptors with the same interface but using async/await syntax.

Interceptors in Rust (tonic)

Rust's gRPC ecosystem centers on the tonic crate, which builds on tower's middleware system rather than implementing gRPC-specific interceptors. This means Rust gRPC middleware uses the same tower::Service and tower::Layer abstractions used by the rest of the Rust async ecosystem (including axum and hyper).

For simple interceptors, tonic provides the Interceptor trait and the InterceptedService wrapper:

use tonic::{Request, Status};

fn auth_interceptor(req: Request<()>) -> Result<Request<()>, Status> {
    match req.metadata().get("authorization") {
        Some(token) if validate_token(token) => Ok(req),
        _ => Err(Status::unauthenticated("invalid or missing token")),
    }
}

// Apply to a service
let svc = MyServiceServer::with_interceptor(my_service, auth_interceptor);

For more complex middleware (wrapping responses, measuring latency, intercepting streams), you use Tower layers directly:

use tower::{ServiceBuilder, Layer, Service};
use std::task::{Context, Poll};
use std::pin::Pin;
use std::future::Future;

#[derive(Clone)]
struct MetricsLayer;

impl<S> Layer<S> for MetricsLayer {
    type Service = MetricsService<S>;
    fn layer(&self, inner: S) -> Self::Service {
        MetricsService { inner }
    }
}

#[derive(Clone)]
struct MetricsService<S> {
    inner: S,
}

impl<S, B> Service<http::Request<B>> for MetricsService<S>
where
    S: Service<http::Request<B>> + Clone + Send + 'static,
    S::Future: Send + 'static,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = Pin<Box<dyn Future<Output = Result<S::Response, S::Error>> + Send>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: http::Request<B>) -> Self::Future {
        let mut inner = self.inner.clone();
        Box::pin(async move {
            let start = std::time::Instant::now();
            let resp = inner.call(req).await;
            let elapsed = start.elapsed();
            // record metrics...
            resp
        })
    }
}

The Tower approach is more verbose than Go's functional interceptors, but it composes naturally with the rest of the Rust ecosystem and is fully type-safe at compile time. You can use tower::ServiceBuilder to chain multiple layers:

let layer = ServiceBuilder::new()
    .layer(MetricsLayer)
    .layer(tonic::service::interceptor(auth_interceptor))
    .into_inner();

let server = Server::builder()
    .layer(layer)
    .add_service(MyServiceServer::new(my_service))
    .serve(addr)
    .await?;

Comparison to HTTP Middleware

If you have written middleware for HTTP frameworks (Express, Django, Axum, Gin), gRPC interceptors will feel familiar but differ in several important ways:

HTTP Middleware vs gRPC Interceptors

    HTTP Middleware                    gRPC Interceptors
    -------------------------------    ---------------------------------
    Operates on request/response       Typed: unary vs stream
    Headers are string key-value       Metadata + binary (-bin) keys
    URL path routing for method        info.FullMethod for dispatch
    Status: 200, 401, 500 ...          Status: OK, Unauthenticated ...
    No native deadline propagation     Built-in deadline/timeout
    Body is a byte stream (JSON)       Protobuf typed messages
    One middleware type for all        Separate server/client hooks

The most significant conceptual difference is the unary/stream split. HTTP middleware always sees a request and produces a response -- streaming is handled at a lower level (chunked transfer, WebSockets) and usually transparent to middleware. gRPC interceptors must explicitly handle both patterns because the interception surface differs: a unary interceptor can inspect the full request object, while a stream interceptor can only wrap the stream and intercept messages as they flow.

Another important difference is deadline propagation. HTTP has no native concept of request deadlines. Timeouts are typically implemented per-hop (Nginx's proxy_read_timeout, for example) and do not automatically propagate through service chains. gRPC deadlines are a first-class concept: set once by the client, transmitted in metadata, and automatically applied to every downstream call's context. This makes gRPC interceptors naturally deadline-aware, while HTTP middleware needs external mechanisms (custom headers, OpenTelemetry baggage) to achieve the same effect.

The typed status codes also matter for interceptor design. HTTP middleware often needs to parse response bodies to determine if an error occurred (a 200 response might contain an error payload in REST APIs). gRPC has a well-defined set of 16 status codes, and every interceptor can switch on status.Code(err) to make decisions about retries, metrics labels, and error handling.

Advanced Patterns

Per-Method Interceptors

Not every interceptor should run on every method. Health check endpoints should not require authentication. Admin methods might need elevated authorization. The selector pattern (available in Go's grpc-middleware) lets you conditionally apply interceptors based on the method name:

// Only apply rate limiting to write methods
selector.UnaryServerInterceptor(
    rateLimitInterceptor(writeLimiter),
    selector.MatchFunc(func(fullMethod string) bool {
        return strings.HasPrefix(fullMethod, "/myservice.WriteService/")
    }),
)

Error Enrichment

An error enrichment interceptor transforms errors from inner layers into richer gRPC status errors with detail protos attached. This centralizes error formatting and ensures clients always receive structured, actionable error information:

func errorEnrichmentInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    resp, err := handler(ctx, req)
    if err != nil {
        st, _ := status.FromError(err)
        // Add debug info, request ID, help links
        st, _ = st.WithDetails(&errdetails.DebugInfo{
            Detail: fmt.Sprintf("request_id=%s method=%s",
                requestID(ctx), info.FullMethod),
        })
        return nil, st.Err()
    }
    return resp, nil
}

Request/Response Transformation

Interceptors can transform requests before they reach the handler or responses before they reach the client. This is useful for field masking (removing sensitive fields from logs), request normalization (trimming whitespace, normalizing email addresses), or response enrichment (adding server-side timestamps).

Idempotency Keys

For non-idempotent RPCs, an interceptor can implement idempotency by extracting a client-provided idempotency key from metadata, checking if the key has been seen before, and either returning the cached response or forwarding to the handler and caching the result. This pattern is especially valuable when clients retry after network failures and you need to prevent duplicate side effects.
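The cache-or-forward step can be sketched with an in-memory map; a production version would key a shared store such as Redis with a TTL and handle concurrent in-flight duplicates. All names here are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// idempotencyCache is a hypothetical in-memory store mapping
// idempotency keys to cached responses.
type idempotencyCache struct {
	mu   sync.Mutex
	seen map[string]string
}

// Do returns the cached result for key if present; otherwise it runs
// fn once and caches its result for subsequent retries.
func (c *idempotencyCache) Do(key string, fn func() string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if resp, ok := c.seen[key]; ok {
		return resp
	}
	resp := fn()
	c.seen[key] = resp
	return resp
}

func main() {
	cache := &idempotencyCache{seen: map[string]string{}}
	calls := 0
	charge := func() string { calls++; return "charged" }
	cache.Do("key-1", charge)
	cache.Do("key-1", charge) // retry: served from cache, no second charge
	fmt.Println(calls)        // 1
}
```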

Testing Interceptors

Interceptors should be tested in isolation, without spinning up a full gRPC server. Since interceptors are just functions (in Go) or implementations of an interface (in Java/Rust), you can call them directly with mock handlers:

func TestAuthInterceptor_MissingToken(t *testing.T) {
    handler := func(ctx context.Context, req interface{}) (interface{}, error) {
        t.Fatal("handler should not be called")
        return nil, nil
    }

    ctx := context.Background()
    // No metadata attached -- should fail
    _, err := authInterceptor(ctx, nil, &grpc.UnaryServerInfo{
        FullMethod: "/test.Service/Method",
    }, handler)

    st, _ := status.FromError(err)
    if st.Code() != codes.Unauthenticated {
        t.Errorf("expected Unauthenticated, got %v", st.Code())
    }
}

func TestAuthInterceptor_ValidToken(t *testing.T) {
    handlerCalled := false
    handler := func(ctx context.Context, req interface{}) (interface{}, error) {
        handlerCalled = true
        // Verify user ID was injected into context
        userID := ctx.Value(userIDKey).(string)
        if userID != "user-123" {
            t.Errorf("expected user-123, got %s", userID)
        }
        return &pb.Response{}, nil
    }

    md := metadata.New(map[string]string{
        "authorization": "Bearer valid-token-for-user-123",
    })
    ctx := metadata.NewIncomingContext(context.Background(), md)

    _, err := authInterceptor(ctx, nil, &grpc.UnaryServerInfo{
        FullMethod: "/test.Service/Method",
    }, handler)

    if err != nil {
        t.Errorf("unexpected error: %v", err)
    }
    if !handlerCalled {
        t.Error("handler was not called")
    }
}

For integration testing, bufconn (in Go) provides an in-memory gRPC connection that avoids binding to a real network port, making tests fast and parallelizable. The interceptor chain runs exactly as it would in production, including metadata propagation and deadline handling.

Performance Considerations

Every interceptor in the chain adds latency to every RPC. For high-throughput services handling tens of thousands of RPCs per second, interceptor overhead matters. A few guidelines: avoid per-RPC allocations and reflection on the hot path; do not serialize full request/response payloads for logging in production (sample instead); precompute method matchers rather than string-matching on every call; use the selector pattern to skip expensive interceptors for high-volume methods like health checks; and benchmark the full chain, since costs compound across interceptors.

Interceptors are the backbone of production gRPC infrastructure. They enforce the principle of separation of concerns: business logic lives in handlers, while infrastructure concerns -- authentication, observability, resilience -- live in the interceptor chain. Getting the chain composition right (order, conditional application, error handling) is one of the most impactful decisions in a gRPC architecture. For further reading, see how interceptors integrate with gRPC security and error handling patterns.
