How Segment Routing Works: SR-MPLS and SRv6 Source Routing
Segment Routing (SR) is a source-routing architecture that steers packets through a network by encoding an ordered list of instructions — called segments — in the packet header at the point of ingress. Each segment represents a topological waypoint (a node, a link, a service) or an instruction (e.g., "decapsulate" or "apply policy"). The source router selects the path by composing a segment list, and each intermediate node processes one segment at a time, forwarding the packet toward the next segment. Segment Routing eliminates the need for per-flow state in the network core, replacing hop-by-hop signaling protocols like LDP and RSVP-TE with a source-routed model where all path intelligence resides at the network edge. SR is defined across a family of RFCs including RFC 8402 (architecture), RFC 8660 (SR-MPLS), and RFC 8986 (SRv6).
Segment Routing has become the dominant network architecture for modern ISP and data center fabrics. It builds directly on the IGP (IS-IS or OSPF) to distribute segment identifiers, integrates with BGP for inter-domain traffic engineering, and supports both the MPLS data plane (SR-MPLS) and the IPv6 data plane (SRv6). It solves the scalability problems of traditional MPLS traffic engineering while providing new capabilities like network slicing and service chaining.
Why Segment Routing Was Needed
Traditional MPLS networks rely on multiple control-plane protocols that create per-prefix or per-tunnel state on every router in the path:
- LDP (Label Distribution Protocol): Distributes labels for IGP shortest paths. Every router maintains LDP sessions with its neighbors and allocates labels for every prefix. In a network with N routers and P prefixes, the total label state is O(N × P).
- RSVP-TE (Resource Reservation Protocol with Traffic Engineering): Signals explicitly routed LSPs with bandwidth reservations. Each tunnel creates per-hop state on every router along the path. In a full-mesh of tunnels between N PE routers, the total RSVP state on each transit router is O(N²). Managing thousands of RSVP-TE tunnels is operationally burdensome and prone to state synchronization failures.
- Make-before-break rerouting: When RSVP-TE needs to re-optimize a path, it must signal a new LSP before tearing down the old one, creating transient double state.
Segment Routing eliminates LDP entirely and replaces RSVP-TE for most traffic engineering use cases. Labels are derived from segment identifiers that are distributed by the IGP — the same protocol that already provides the topology. No additional signaling protocol is needed. Transit routers maintain no per-tunnel state; they simply process segments encoded in the packet header by the ingress router.
Segment Types
Segment Routing defines several types of segments, each representing a different forwarding instruction. The most fundamental are:
Prefix Segment (Prefix SID)
A Prefix SID is a globally significant segment identifier associated with an IP prefix, typically a router's loopback address. The Prefix SID instructs the network to forward the packet along the shortest IGP path to that prefix. Every router in the SR domain computes the same shortest path for a given Prefix SID, so the forwarding behavior is deterministic.
Prefix SIDs are allocated from a global range called the Segment Routing Global Block (SRGB). For SR-MPLS, the SRGB is a range of MPLS label values (e.g., 16000-23999). Each router advertises its SRGB via the IGP. A Prefix SID is expressed as an index into the SRGB — for example, a Prefix SID index of 100 maps to MPLS label 16100 on a router with SRGB starting at 16000. Importantly, all routers in the domain should have the same SRGB for operational simplicity, though the architecture supports per-node SRGB ranges.
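The index-to-label mapping described above is simple arithmetic. The following is a minimal sketch; the function name `prefix_sid_label` and the second SRGB base (20000) are illustrative assumptions, not taken from any vendor implementation.

```python
def prefix_sid_label(srgb_base: int, srgb_size: int, index: int) -> int:
    """Map a Prefix SID index to the MPLS label inside one node's SRGB."""
    if not 0 <= index < srgb_size:
        raise ValueError(f"index {index} is outside an SRGB of size {srgb_size}")
    return srgb_base + index


# SRGB 16000-23999 is a block of 8000 labels.
print(prefix_sid_label(16000, 8000, 100))  # 16100

# A node with a different SRGB base maps the same index to a different label,
# which is why a uniform SRGB is operationally simpler.
print(prefix_sid_label(20000, 8000, 100))  # 20100
```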
Adjacency Segment (Adj-SID)
An Adjacency SID is a locally significant segment identifier associated with a specific adjacency (link) on a router. The Adj-SID instructs the receiving router to forward the packet over a specific link, regardless of the IGP shortest path. Adj-SIDs are allocated from the Segment Routing Local Block (SRLB), a range of labels local to each router.
Adj-SIDs enable explicit routing: by encoding a sequence of Adj-SIDs, the ingress router can force a packet to traverse a specific set of links, bypassing the IGP shortest path. This provides traffic engineering capability equivalent to RSVP-TE explicit paths but without any per-tunnel signaling.
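As a rough illustration, an explicit path can be turned into a segment list by looking up one Adj-SID per link. The router names, the label values, and the `adj_sids` table below are hypothetical; in a real network the head-end learns these labels from the IGP.

```python
def explicit_path_stack(hops, adj_sids):
    """Build a segment list that pins a packet to each link along `hops`.

    `adj_sids` maps (router, neighbor) to the Adj-SID label that `router`
    allocated for that link. Each label is only meaningful on the router
    that owns it.
    """
    return [adj_sids[(a, b)] for a, b in zip(hops, hops[1:])]


# Hypothetical Adj-SID labels, one per directed link.
adj_sids = {
    ("R1", "R5"): 24015,
    ("R5", "R3"): 24053,
    ("R3", "R4"): 24034,
}

print(explicit_path_stack(["R1", "R5", "R3", "R4"], adj_sids))
# [24015, 24053, 24034]
```

In practice the head-end resolves the first segment itself (it simply sends the packet out the R1-R5 link), so only the remaining labels travel on the wire.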
Binding Segment (Binding SID)
A Binding SID binds a segment identifier to a policy (a segment list or a set of weighted segment lists). When a router receives a packet with a Binding SID at the top of the segment list, it pops the Binding SID and pushes the bound segment list. This provides scalability by abstracting complex segment lists behind a single SID. Binding SIDs are used for SR Policy stitching across domains and for hiding internal topology details.
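The pop-and-push behavior can be sketched in a few lines. This assumes a single bound segment list per Binding SID (no weighted lists), and the label values and the `binding_table` name are made up for illustration.

```python
def apply_binding_sid(label_stack, binding_table):
    """Pop a Binding SID from the top of the stack and push its bound list.

    `label_stack[0]` is the top of the stack. If the top label is not a
    Binding SID known to this node, the stack is returned unchanged.
    """
    if not label_stack:
        return []
    top, rest = label_stack[0], label_stack[1:]
    bound = binding_table.get(top)
    if bound is None:
        return list(label_stack)
    return list(bound) + rest


# Hypothetical: Binding SID 40001 stands for a three-segment policy.
binding_table = {40001: [16002, 24005, 16004]}

print(apply_binding_sid([40001, 16009], binding_table))
# [16002, 24005, 16004, 16009]
```

The upstream domain only needs to know label 40001; the three-segment detail stays local to the node that owns the binding.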
SR-MPLS: Segment Routing over MPLS
SR-MPLS uses the existing MPLS data plane to encode segments. Each segment is represented as an MPLS label in a label stack. The ingress router pushes a stack of MPLS labels representing the segment list, and each transit router forwards on the top label: it swaps the label while the segment is still in progress and pops it once the segment is complete, exposing the next one.
SR-MPLS requires no changes to the MPLS forwarding plane — existing MPLS-capable hardware works unchanged. The key change is in the control plane: labels are distributed by the IGP (IS-IS or OSPF with SR extensions) rather than by LDP or RSVP-TE.
How SR-MPLS Forwarding Works
Consider a packet from R1 to R4 that follows the IGP shortest path R1 → R2 → R3 → R4, where R4 has Prefix SID index 4 (label 16004 with SRGB base 16000):
- R1 (ingress): Pushes label stack {16004}. Since R2 is the IGP next-hop toward R4, R1 knows that R2 will understand label 16004 as "forward toward R4's loopback."
- R2 (transit): Receives packet with top label 16004. R2 looks up label 16004 in its MPLS forwarding table: the action is SWAP to 16004 (the label for SID 4 is the same on all routers with the same SRGB) and forward to R3, the next IGP hop toward R4. R2 swaps rather than pops because it is not the penultimate hop; Penultimate Hop Popping (PHP) applies only at R3, the last router before R4.
- R3 (penultimate): Receives packet with top label 16004. Being the penultimate hop, R3 POPs the label and forwards the bare IP packet (or the remaining label stack) to R4.
- R4 (egress): Receives the packet and performs normal IP forwarding (or pops the next label if there is a service label below).
This is functionally identical to traditional MPLS forwarding, but the labels were computed from the IGP topology and SID assignments rather than signaled by LDP or RSVP-TE.
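The walk-through above can be traced with a small simulation. This is a simplified sketch that assumes a uniform SRGB base of 16000, a single Prefix SID on the stack, and a path of at least three routers; `forward_prefix_sid` is an illustrative name, not a real API.

```python
SRGB_BASE = 16000  # assumed identical on every router


def forward_prefix_sid(path, sid_index, php=True):
    """Trace label operations for one Prefix SID along an IGP shortest path.

    `path` lists the routers from ingress to egress (the SID owner).
    Returns (router, action, label_stack_after) tuples.
    """
    if len(path) < 3:
        raise ValueError("this sketch expects at least one transit router")
    label = SRGB_BASE + sid_index
    trace = [(path[0], "PUSH", [label])]
    for i in range(1, len(path) - 1):
        is_penultimate = i == len(path) - 2
        if is_penultimate and php:
            trace.append((path[i], "POP", []))        # Penultimate Hop Popping
        else:
            trace.append((path[i], "SWAP", [label]))  # same label in, same label out
    # With PHP the egress sees no SR label; without it, the egress pops.
    trace.append((path[-1], "IP lookup" if php else "POP", []))
    return trace


for hop in forward_prefix_sid(["R1", "R2", "R3", "R4"], sid_index=4):
    print(hop)
# ('R1', 'PUSH', [16004])
# ('R2', 'SWAP', [16004])
# ('R3', 'POP', [])
# ('R4', 'IP lookup', [])
```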
SRv6: Segment Routing over IPv6
SRv6 (RFC 8986) is the IPv6 instantiation of Segment Routing. Instead of MPLS labels, segments are encoded as IPv6 addresses in the Segment Routing Header (SRH), an IPv6 extension header (routing header type 4). Each segment is a 128-bit IPv6 address, and the segment list is carried as an ordered list of IPv6 addresses in the SRH.
SRv6 uses a concept called the SRv6 SID, which is structured as:
- Locator: The routing prefix portion of the SID (e.g., a /48 prefix assigned to the node). The IGP routes traffic toward this locator.
- Function: The instruction to execute when the packet arrives at the node (e.g., End = advance to next segment, End.DT4 = decapsulate and do IPv4 lookup in a VRF, End.DX6 = decapsulate and forward to a specific interface).
- Arguments: Optional parameters for the function (e.g., a VRF identifier or interface index).
For example, the SRv6 SID 2001:db8:A:100:: might decode as: locator 2001:db8:A::/48, function 0x100 (mapped to End.DT4, meaning "decapsulate and perform IPv4 lookup in VRF"). The packet is routed via standard IPv6 forwarding to the node owning the 2001:db8:A::/48 prefix, where the function is executed.
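The decoding in this example can be reproduced with Python's standard `ipaddress` module. The sketch below assumes a 48-bit locator followed by a 16-bit function field, with the remaining 64 bits treated as arguments; real deployments choose their own field lengths, and `decode_srv6_sid` is a hypothetical helper.

```python
import ipaddress


def decode_srv6_sid(sid: str, locator_len: int = 48, function_len: int = 16):
    """Split an SRv6 SID into (locator network, function value, argument value)."""
    value = int(ipaddress.IPv6Address(sid))
    arg_len = 128 - locator_len - function_len
    # Zero everything below the locator to get the routable prefix.
    locator_value = (value >> (128 - locator_len)) << (128 - locator_len)
    function = (value >> arg_len) & ((1 << function_len) - 1)
    argument = value & ((1 << arg_len) - 1)
    locator = ipaddress.IPv6Network((locator_value, locator_len))
    return locator, function, argument


locator, function, argument = decode_srv6_sid("2001:db8:A:100::")
print(locator)        # 2001:db8:a::/48
print(hex(function))  # 0x100
print(argument)       # 0
```

The mapping from function value 0x100 to a behavior such as End.DT4 is local configuration on the node that owns the locator; it is not encoded in the address itself.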
SRv6 Advantages
- No MPLS dependency: SRv6 uses the native IPv6 forwarding plane. Any IPv6-capable router can forward SRv6 traffic (it simply routes toward the destination IPv6 address). Only the nodes that need to process the SRH need to understand SRv6.
- Network programming: SRv6 functions go beyond simple forwarding. Functions like End.B6.Encaps (steer the packet into a bound SRv6 policy by encapsulating it with a new SRH), End.AM (proxy for service function chaining), and End.DT46 (decapsulate and look up the inner packet in an IPv4 or IPv6 VRF table) enable complex network programs expressed as segment lists.
- Native IPv6 integration: SRv6 packets are valid IPv6 packets. They can traverse networks that do not support SRv6 (transit routers just route on the outer IPv6 destination). This enables incremental deployment.
SRv6 Challenges
- Overhead: Each SRv6 segment is a 128-bit IPv6 address (16 bytes), compared to 4 bytes for an MPLS label. A segment list of 5 segments adds 80 bytes of SIDs plus the 8-byte fixed part of the SRH, on top of the 40-byte outer IPv6 header. This is significant for small packets (e.g., VoIP). Compression techniques like uSID (micro-SID) address this by packing multiple short SIDs into a single 128-bit IPv6 address.
- Hardware support: Processing the SRH requires hardware pipeline support that not all ASICs provide. Early SRv6 implementations were software-only; modern ASICs from vendors such as Broadcom and Cisco (Silicon One) support SRv6 at line rate.
- MTU: The SRH overhead increases packet size, which can trigger fragmentation or path MTU issues. Networks deploying SRv6 must ensure adequate MTU headroom (typically 9000+ byte jumbo frames in the underlay).
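To put the overhead and MTU points in numbers, here is a back-of-the-envelope comparison. It assumes full (uncompressed) SRH encapsulation with a 40-byte outer IPv6 header and an 8-byte fixed SRH part, and it ignores optimizations such as the reduced SRH or uSID; the function names are illustrative.

```python
IPV6_HEADER = 40  # bytes, fixed IPv6 header
SRH_FIXED = 8     # bytes, fixed part of the Segment Routing Header
SID_SIZE = 16     # bytes, one 128-bit SRv6 SID
MPLS_LABEL = 4    # bytes, one MPLS label stack entry


def srv6_encap_overhead(num_sids: int) -> int:
    """Bytes added by an outer IPv6 header plus an SRH carrying `num_sids` SIDs."""
    return IPV6_HEADER + SRH_FIXED + SID_SIZE * num_sids


def sr_mpls_overhead(num_labels: int) -> int:
    """Bytes added by an MPLS label stack of `num_labels` entries."""
    return MPLS_LABEL * num_labels


print(srv6_encap_overhead(5))  # 128
print(sr_mpls_overhead(5))     # 20
```

For a 5-segment path the difference is 108 bytes per packet, which is why MTU headroom and SID compression matter more for SRv6 than for SR-MPLS.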
SR Traffic Engineering
SR Policy (defined in RFC 9256, previously draft-ietf-spring-segment-routing-policy) is the mechanism for applying traffic engineering in Segment Routing networks. An SR Policy consists of:
- Head-end: The router that steers traffic into the policy.
- Color: A numeric identifier that associates the policy with a specific intent (e.g., low-latency, high-bandwidth, disjoint from another path).
- Endpoint: The destination of the policy (a remote PE or service node).
- Candidate paths: One or more segment lists that implement the policy. Each candidate path has a preference; the highest-preference valid path is active.
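The components above map naturally onto a small data model. The sketch below is an assumption-laden simplification: each candidate path carries a single segment list and a boolean validity flag, and tie-breaking between equal preferences is not modeled. The class and field names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class CandidatePath:
    preference: int
    segment_list: list
    valid: bool = True  # e.g. False if a SID in the list cannot be resolved


@dataclass
class SRPolicy:
    headend: str
    color: int
    endpoint: str
    candidate_paths: list = field(default_factory=list)

    def active_path(self):
        """Return the highest-preference valid candidate path, or None."""
        valid = [p for p in self.candidate_paths if p.valid]
        return max(valid, key=lambda p: p.preference, default=None)


policy = SRPolicy(
    headend="R1",
    color=100,  # hypothetical color meaning "low latency"
    endpoint="R4",
    candidate_paths=[
        CandidatePath(preference=200, segment_list=[16002, 16004], valid=False),
        CandidatePath(preference=100, segment_list=[16003, 16004]),
    ],
)

print(policy.active_path().segment_list)  # [16003, 16004]
```

The preference-200 path is skipped because it is marked invalid, so the preference-100 path becomes active.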
SR Policies can be created by:
- Local configuration: The head-end is manually configured with the segment list.
- PCE (Path Computation Element): A centralized controller (RFC 8231, RFC 8281) computes paths based on the network topology, constraints, and objectives, then programs SR Policies on head-end routers via PCEP (Path Computation Element Protocol). The PCE has a global view of the network and can compute optimal paths, disjoint paths, and bandwidth-constrained paths that individual routers cannot compute with only local topology knowledge.
- BGP SR Policy: BGP can distribute SR Policies between autonomous systems or from a controller to head-end routers using the BGP SR Policy SAFI.
TI-LFA: Fast Reroute for Segment Routing
Topology-Independent Loop-Free Alternate (TI-LFA) is the fast-reroute mechanism for Segment Routing networks. TI-LFA computes backup paths that protect against any single link or node failure, providing sub-50ms failover without requiring dedicated backup tunnels (unlike MPLS FRR, which requires pre-signaled RSVP-TE backup LSPs).
TI-LFA works by pre-computing, for each destination, a post-convergence path — the path that the IGP would compute after the failure. The protecting router (the Point of Local Repair, or PLR) encodes this post-convergence path as a segment list and installs it as a backup in the FIB. When BFD detects a failure, the PLR immediately activates the backup segment list, forwarding packets along the post-convergence path.
The key advantage of TI-LFA is that it is topology-independent: it can compute a backup path for any single failure in any topology, without requiring the operator to design backup tunnels or ensure specific topological properties. Traditional MPLS FRR required pre-signaled RSVP-TE backup LSPs for each protected interface, and LFA (Loop-Free Alternate, RFC 5286) could only protect links where a suitable alternate next-hop existed — which was not guaranteed in all topologies.
TI-LFA computes the backup path by:
- Computing the post-convergence SPF tree (the tree that would exist after the failure).
- Finding a P node and a Q node on the post-convergence path. The P node is the last node on that path that the PLR can still reach over its pre-failure shortest path without crossing the failed element; the Q node is the first node from which the destination is reachable by ordinary shortest-path forwarding without crossing the failed element. In many cases, P and Q are the same node.
- Encoding a segment list that directs traffic to the P node (via a Prefix SID) and, if P and Q differ, from P to Q (typically via an Adj-SID when they are adjacent). From the Q node the packet follows the normal shortest path to the destination.
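The computation can be sketched for the link-protection case with plain Dijkstra runs. This is a simplified model, not a production algorithm: it assumes symmetric link costs, uses the common P-space/Q-space formulation (P: reachable from the PLR on a pre-failure shortest path that avoids the protected link; Q: able to reach the destination without crossing it), and does not handle node protection or SRLGs. All function names are illustrative.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, prev) from `source`, where graph[u][v] is the link cost."""
    dist, prev, heap = {source: 0}, {}, [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v], prev[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    return dist, prev


def without_link(graph, a, b):
    """Copy of `graph` with the a-b link removed in both directions."""
    return {u: {v: c for v, c in nbrs.items() if {u, v} != {a, b}}
            for u, nbrs in graph.items()}


def tilfa_repair(graph, plr, nbr, dest):
    """Return (post_convergence_path, p_node, q_node) protecting link plr-nbr."""
    d_plr, _ = dijkstra(graph, plr)
    d_nbr, _ = dijkstra(graph, nbr)
    d_dest, _ = dijkstra(graph, dest)  # symmetric costs assumed
    link = graph[plr][nbr]

    # Post-convergence path: shortest path once the protected link is gone.
    _, prev = dijkstra(without_link(graph, plr, nbr), plr)
    post = [dest]
    while post[-1] != plr:
        post.append(prev[post[-1]])
    post.reverse()

    def in_p(x):  # no pre-failure shortest path plr -> x uses the link
        return d_plr[x] < link + d_nbr[x]

    def in_q(x):  # no shortest path x -> dest uses the link
        return d_dest[x] < d_plr[x] + link + d_dest[nbr]

    p_idx = max(i for i, x in enumerate(post) if in_p(x))
    q_idx = min(i for i in range(p_idx, len(post)) if in_q(post[i]))
    return post, post[p_idx], post[q_idx]


# Five-node ring with unit costs; S protects its link to N for traffic to D.
ring = {
    "S": {"N": 1, "A": 1}, "N": {"S": 1, "D": 1},
    "A": {"S": 1, "B": 1}, "B": {"A": 1, "D": 1},
    "D": {"N": 1, "B": 1},
}
print(tilfa_repair(ring, "S", "N", "D"))
# (['S', 'A', 'B', 'D'], 'B', 'B')
```

Here P and Q coincide at B, so a single Prefix SID for B is enough as the repair segment list. Raising the A-B cost would separate P (A) from Q (B), which is the case where an Adj-SID for the A-B link is added.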
Flexible Algorithm (Flex-Algo)
Flexible Algorithm (RFC 9350) extends Segment Routing to support multiple parallel routing computations within the same IGP instance. Each Flex-Algo is identified by a number (128-255) and defines:
- Calculation type: The algorithm to use (e.g., SPF for shortest path, strict SPF, or a vendor-defined algorithm).
- Metric type: The metric to optimize (IGP metric, TE metric, latency, or other link attributes).
- Constraints: Which links to include or exclude (based on administrative groups/affinities, SRLGs, or other link properties).
For example, Flex-Algo 128 might compute shortest paths based on latency rather than IGP metric, while Flex-Algo 129 computes paths that avoid links marked with the "high-latency" affinity. Each Flex-Algo has its own set of Prefix SIDs (derived from the same prefix but with a different algorithm ID), so the ingress router can steer traffic into a specific Flex-Algo by pushing the corresponding Prefix SID.
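The two example algorithms can be modeled as the same shortest-path computation run over a pruned topology with a different metric. The four-router topology, the link attributes, and the `flex_algo_path` helper below are hypothetical; algorithm 0 (the default IGP SPF) is included only for comparison and is not a Flex-Algo.

```python
import heapq

# Hypothetical link database: each undirected link has two metrics and a set of affinities.
LINKS = {
    ("R1", "R2"): {"igp": 10, "delay": 30, "affinity": {"high-latency"}},
    ("R2", "R4"): {"igp": 10, "delay": 30, "affinity": {"high-latency"}},
    ("R1", "R3"): {"igp": 20, "delay": 5, "affinity": set()},
    ("R3", "R4"): {"igp": 20, "delay": 5, "affinity": set()},
}

FLEX_ALGOS = {
    0: {"metric": "igp", "exclude": set()},               # default SPF
    128: {"metric": "delay", "exclude": set()},           # minimize latency
    129: {"metric": "igp", "exclude": {"high-latency"}},  # avoid marked links
}


def flex_algo_path(algo, source, dest):
    """Return (cost, path) for `algo`, or None if `dest` is unreachable."""
    definition = FLEX_ALGOS[algo]
    graph = {}
    for (a, b), attrs in LINKS.items():
        if attrs["affinity"] & definition["exclude"]:
            continue  # prune links carrying an excluded affinity
        cost = attrs[definition["metric"]]
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost

    heap, visited = [(0, source, [source])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dest:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, c in graph.get(node, {}).items():
            if nxt not in visited:
                heapq.heappush(heap, (cost + c, nxt, path + [nxt]))
    return None


print(flex_algo_path(0, "R1", "R4"))    # (20, ['R1', 'R2', 'R4'])
print(flex_algo_path(128, "R1", "R4"))  # (10, ['R1', 'R3', 'R4'])
print(flex_algo_path(129, "R1", "R4"))  # (40, ['R1', 'R3', 'R4'])
```

The same destination ends up with a different path per algorithm, which is why each Flex-Algo needs its own Prefix SID for that destination.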
Flex-Algo enables network slicing: different applications or tenants can be assigned to different Flex-Algos, each with its own path computation constraints. A real-time video application might use a low-latency Flex-Algo, while bulk data transfer uses a shortest-path (least-cost) Flex-Algo. All of this runs on the same physical infrastructure without additional overlay tunnels.
SR-MPLS vs. SRv6: Deployment Considerations
Both SR-MPLS and SRv6 implement the same Segment Routing architecture but differ in their data-plane representation:
- SR-MPLS: Uses existing MPLS data plane. Supported on virtually all existing MPLS-capable hardware. Segments are 4-byte MPLS labels. Minimal overhead. Best choice for brownfield networks already running MPLS or for environments where packet overhead matters (e.g., mobile backhaul with small VoIP packets). Does not require IPv6.
- SRv6: Uses IPv6 data plane. Requires IPv6-capable infrastructure but not MPLS. Segments are 128-bit IPv6 addresses (16 bytes each). Higher overhead but provides "network programming" capabilities (rich function semantics in SIDs). Best choice for greenfield IPv6 networks, cloud providers, and environments where the IPv6 underlay is already deployed. uSID compression reduces overhead significantly.
Many large operators (Google, Meta, Softbank, several Tier-1 ISPs) have deployed SRv6 in production. SR-MPLS remains more widely deployed overall due to the massive installed base of MPLS infrastructure. Some operators run both: SR-MPLS in the MPLS-based WAN and SRv6 in IPv6-native data center fabrics.
Segment Routing and BGP
Segment Routing integrates deeply with BGP in several ways:
- BGP Prefix SID: BGP can carry a Prefix SID attribute (RFC 8669) that associates an SR SID with a BGP prefix. This enables end-to-end SR paths across multiple IGP domains, with BGP stitching the SR label stacks at domain boundaries.
- BGP-LU (Labeled Unicast): In inter-AS MPLS deployments, BGP Labeled Unicast distributes labels between autonomous systems. SR-MPLS can use BGP-LU for inter-domain label distribution, with the Prefix SID attribute providing consistent SID allocation.
- BGP SR Policy: The BGP SR Policy SAFI enables a controller (or remote head-end) to distribute SR Policies via BGP, including the full segment list, color, and endpoint.
- BGP EPE (Egress Peer Engineering): RFC 9086 defines BGP-LS extensions that advertise BGP peering segments (PeerNode, PeerAdj, and PeerSet SIDs), allowing an ingress router to steer traffic to a specific eBGP peer (a specific peering link or peering router). This provides traffic engineering for inter-domain traffic without relying on traditional BGP path selection attributes.
See BGP Paths in Action
Segment Routing transforms how traffic is engineered within and between autonomous systems. The BGP routes you see in the global routing table are carried across networks increasingly powered by SR-MPLS and SRv6. To explore live BGP routing data — AS paths, prefix origins, and the inter-domain routing that SR traffic engineering optimizes — use the god.ad BGP Looking Glass.