How MPLS Works: Multiprotocol Label Switching Explained
Multiprotocol Label Switching (MPLS) is a high-performance forwarding technology that sits between Layer 2 (data link) and Layer 3 (network) of the OSI model. Instead of looking up each packet's destination IP address in a routing table at every hop, MPLS routers attach a short, fixed-length label to packets at the network edge and use that label to make forwarding decisions throughout the core. This label-based approach enables faster forwarding, deterministic traffic engineering, and sophisticated VPN services that would be difficult or impossible with pure IP routing alone.
MPLS was developed in the late 1990s to address the performance limitations of destination-based IP forwarding. While modern hardware has closed the raw speed gap between label switching and IP lookup, MPLS remains deeply entrenched in service provider networks because of its traffic engineering and VPN capabilities. It interacts closely with BGP, which is used to distribute VPN routing information and signal label bindings across autonomous system boundaries.
Label Switching vs. IP Routing
In traditional IP routing, every router along a path performs an independent forwarding decision. When a packet arrives, the router extracts the destination IP address, performs a longest prefix match against its routing table (which may contain over a million entries in the default-free zone), determines the next hop, and forwards the packet. This process repeats at every router.
MPLS fundamentally changes this model. The forwarding decision based on the full IP header happens only once, at the ingress edge of the MPLS domain. There, the router (called a Label Edge Router or LER) classifies the packet into a Forwarding Equivalence Class (FEC) — a group of packets that will be forwarded the same way — and pushes an MPLS label onto the packet. From that point forward, every core router (called a Label Switch Router or LSR) simply reads the short label, looks it up in a much smaller label forwarding table, swaps it for a new label, and sends the packet to the next hop. No IP header inspection is needed in the core.
A Forwarding Equivalence Class can be defined by many criteria: a destination prefix, a combination of source and destination, a QoS class, or a VPN identifier. The label abstraction is also what makes MPLS "multiprotocol": because core routers forward on the label alone, the payload can be any network-layer protocol, not just IPv4.
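The ingress classification step can be sketched in a few lines of Python. This is a minimal illustration, assuming FECs are defined by destination prefix only; the table contents, label values, and interface names are hypothetical.

```python
import ipaddress

# Hypothetical FEC table on an ingress LER:
# destination prefix -> (label to push, outgoing interface).
FEC_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): (100, "Gi0/1"),
    ipaddress.ip_network("10.1.2.0/24"): (200, "Gi0/2"),
    ipaddress.ip_network("0.0.0.0/0"): (300, "Gi0/1"),
}

def classify(dst):
    """Longest-prefix match, done once at the edge of the MPLS domain."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FEC_TABLE if addr in net]
    if not matches:
        return None  # no FEC for this destination (e.g. an IPv6 address)
    best = max(matches, key=lambda net: net.prefixlen)
    return FEC_TABLE[best]

print(classify("10.1.2.7"))   # (200, 'Gi0/2')
print(classify("10.1.9.9"))   # (100, 'Gi0/1')
print(classify("192.0.2.1"))  # (300, 'Gi0/1')
```

After this single lookup, the packet carries the chosen label and core routers never repeat the prefix match.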
The MPLS Label Format
The MPLS label is a 32-bit header inserted between the Layer 2 (e.g., Ethernet) header and the Layer 3 (IP) header. This position is why MPLS is sometimes called a "Layer 2.5" protocol. The 32-bit label header is divided into four fields:
Label (20 bits)
The label value itself, ranging from 0 to 1,048,575. Labels 0 through 15 are reserved for special purposes. Label 0 is the IPv4 Explicit Null label, telling the receiving router to pop the label and do an IPv4 lookup. Label 2 is the IPv6 Explicit Null. Label 3 is the Implicit Null label, used in signaling (but never actually appears on the wire) to request penultimate hop popping. Labels 16 and above are assigned dynamically by label distribution protocols.
Traffic Class (TC) — 3 bits
Originally called the "Experimental" (EXP) bits, this field was renamed to Traffic Class by RFC 5462. It carries QoS information (similar to the DSCP field in the IP header), allowing MPLS networks to differentiate traffic classes and apply per-hop scheduling behaviors like priority queuing or weighted fair queuing.
Bottom of Stack (S) — 1 bit
This bit is set to 1 if this label entry is the last (bottom) label in the stack, and 0 otherwise. MPLS supports stacking multiple labels (described below), and this bit tells the router when it has reached the innermost label and the next header is the payload (typically an IP packet).
Time to Live (TTL) — 8 bits
Functions identically to the IP TTL field. It is decremented at each hop and the packet is discarded if it reaches zero. This prevents forwarding loops in the MPLS domain and enables traceroute to work through MPLS networks (when TTL propagation is enabled).
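As a rough illustration of the bit layout described above, the following Python sketch packs and unpacks a single 32-bit label stack entry. The function names are illustrative, not part of any library.

```python
import struct

def encode_entry(label, tc, s, ttl):
    """Pack Label (20 bits), TC (3), S (1), and TTL (8) into 4 bytes."""
    if not (0 <= label < 2**20 and 0 <= tc < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order

def decode_entry(data):
    """Unpack the first 4 bytes of data into the four fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }

raw = encode_entry(label=42, tc=5, s=1, ttl=64)
print(raw.hex())          # 0002ab40
print(decode_entry(raw))  # {'label': 42, 'tc': 5, 's': 1, 'ttl': 64}
```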
Label Switched Paths (LSPs)
A Label Switched Path (LSP) is the end-to-end path that labeled packets follow through an MPLS network. An LSP is unidirectional: a path from router A to router B is a separate LSP from the path from B to A. Along an LSP, each router performs one of three operations on the label:
- Push — add a new label onto the packet (done at the ingress LER)
- Swap — replace the top label with a new label (done at each transit LSR)
- Pop — remove the top label (done at the egress LER, or at the penultimate hop when PHP is used)
Each LSR maintains a Label Forwarding Information Base (LFIB) that maps incoming labels to outgoing labels and interfaces. When a packet arrives with label 42 on interface Gi0/0, the LFIB might say: swap label 42 for label 78, forward out interface Gi0/1. This table lookup is an exact match on a small integer — vastly simpler than a longest-prefix match on a full IP address.
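The LFIB lookup can be modeled as a plain dictionary keyed by the incoming label. The sketch below reuses the label 42 to 78 example from the text; the second entry and the function name are hypothetical, and real LFIBs hold more state than this.

```python
# Hypothetical LFIB for one LSR:
# incoming label -> (action, outgoing label, outgoing interface).
LFIB = {
    42: ("swap", 78, "Gi0/1"),
    43: ("pop", None, "Gi0/2"),
}

def forward(stack, lfib):
    """Apply the LFIB action to the top label; stack[0] is the top."""
    action, out_label, out_if = lfib[stack[0]]  # exact match on a small integer
    if action == "swap":
        return [out_label] + stack[1:], out_if
    if action == "pop":
        return stack[1:], out_if
    raise ValueError("unknown action: " + action)

print(forward([42], LFIB))         # ([78], 'Gi0/1')
print(forward([43, 24001], LFIB))  # ([24001], 'Gi0/2')
```

Only the top label is consulted, so any labels beneath it pass through the LSR untouched.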
Label Distribution: LDP and RSVP-TE
For MPLS to work, neighboring routers must agree on which label to use for each FEC. This is the job of label distribution protocols. Two main protocols exist:
LDP (Label Distribution Protocol)
LDP (RFC 5036) is the simpler of the two protocols. It automatically distributes labels for all prefixes in the IGP routing table. When an LSR allocates a label for a prefix, it advertises that binding to its LDP neighbors. LDP follows the IGP's shortest path — it does not create new paths, but rather labels the paths the IGP has already computed.
LDP establishes sessions between neighbors over TCP and exchanges label mappings. It operates in either downstream unsolicited mode (labels are advertised proactively) or downstream on demand mode (labels are provided only when requested). Most deployments use downstream unsolicited with independent label control, where each LSR makes its own label allocation decisions without waiting for downstream confirmation.
LDP is simple to deploy and works well for basic MPLS services like L3VPNs, but it provides no traffic engineering capability — traffic follows the IGP shortest path, period.
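To make the relationship between label bindings and forwarding state concrete, here is a small Python sketch of how LFIB entries could be derived once every LSR has advertised a label for one prefix. The router names, the linear topology, and the label values are assumptions for illustration, and the sketch ignores PHP and the LDP message exchange itself.

```python
# IGP next hop toward one prefix for each router; None marks the egress.
NEXT_HOP = {"R1": "R2", "R2": "R3", "R3": "R4", "R4": None}

# Each LSR independently picks a local label for the prefix and advertises
# it to its neighbors (arbitrary example values, all 16 or above).
bindings = {router: 100 + 10 * i for i, router in enumerate(NEXT_HOP)}

def build_lfibs(next_hop, bindings):
    """Install: my local label in -> the label my IGP next hop advertised."""
    lfibs = {}
    for router, nh in next_hop.items():
        local = bindings[router]
        if nh is None:
            lfibs[router] = {local: ("pop", None)}
        else:
            lfibs[router] = {local: ("swap", bindings[nh])}
    return lfibs

def labels_on_wire(next_hop, bindings, lfibs, ingress):
    """Follow a packet from the ingress and list the label seen on each link."""
    router = next_hop[ingress]
    label = bindings[router]  # the ingress pushes its next hop's label
    seen = [label]
    while True:
        action, out_label = lfibs[router][label]
        if action == "pop":
            return seen
        router, label = next_hop[router], out_label
        seen.append(label)

lfibs = build_lfibs(NEXT_HOP, bindings)
print(labels_on_wire(NEXT_HOP, bindings, lfibs, "R1"))  # [110, 120, 130]
```

The labeled path is simply the IGP next-hop chain, which is why LDP alone cannot steer traffic off the shortest path.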
RSVP-TE (Resource Reservation Protocol — Traffic Engineering)
RSVP-TE (RFC 3209) extends the RSVP signaling protocol to set up explicitly routed LSPs with bandwidth reservations. Unlike LDP, RSVP-TE can create paths that deviate from the IGP shortest path, enabling traffic engineering. The head-end router computes a constrained shortest path (using CSPF — Constrained Shortest Path First) based on available bandwidth, link affinities, and administrative constraints, then signals the path hop-by-hop using RSVP PATH and RESV messages.
Key capabilities of RSVP-TE:
- Explicit routing — the head-end specifies the exact path (strict hops) or partial path (loose hops) the LSP should follow
- Bandwidth reservation — RSVP-TE reserves bandwidth along the path, and routers can reject the reservation if insufficient bandwidth exists
- Fast Reroute (FRR) — pre-computed backup paths that activate in under 50 milliseconds when a link or node fails, far faster than IGP convergence
- Make-before-break — when re-optimizing an LSP, the new path is established before the old one is torn down, preventing traffic loss
RSVP-TE is more complex to operate than LDP because each LSP must be explicitly configured or computed. In large networks, managing tens of thousands of RSVP-TE tunnels can become a significant operational burden — a problem that Segment Routing (discussed below) aims to solve.
Penultimate Hop Popping (PHP)
Penultimate Hop Popping is an optimization where the second-to-last router in the LSP removes the MPLS label before forwarding the packet to the egress router. Without PHP, the egress router would have to perform two lookups: first a label lookup, which tells it to pop the label and expose the IP packet, then an IP lookup to determine the next hop. PHP eliminates the first of these by having the penultimate router strip the label, so the egress router receives a plain IP packet and performs a single IP lookup.
PHP is signaled using the Implicit Null label (label 3). When the egress LER advertises label 3 for a prefix via LDP, it is telling its upstream neighbor: "Do not actually push label 3 — instead, pop the label entirely before forwarding to me." The penultimate router sees the Implicit Null binding and knows to pop rather than swap.
However, PHP has a drawback: by removing the label, it also removes the TC bits that carried QoS information. In networks where QoS differentiation must be preserved to the egress, Explicit Null (label 0 for IPv4, label 2 for IPv6) is used instead. Explicit Null tells the penultimate router to swap the label to 0 (or 2) rather than popping it, preserving the TC field for the egress router to inspect before popping.
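The difference between the two null labels can be sketched as the decision the penultimate router makes for the top label. This is a simplified model assuming a single-label stack; the dictionary layout and function name are hypothetical.

```python
IMPLICIT_NULL = 3     # signaling only, never sent on the wire
EXPLICIT_NULL_V4 = 0
EXPLICIT_NULL_V6 = 2

def penultimate_forward(entry, advertised):
    """Return the label entry sent to the egress, or None if it was popped.

    entry is the top label entry arriving at the penultimate hop;
    advertised is the label the egress router advertised for this FEC.
    """
    if advertised == IMPLICIT_NULL:
        return None  # PHP: the label is removed, and its TC bits with it
    # Explicit Null (or an ordinary label): swap, keeping the TC bits.
    return {"label": advertised, "tc": entry["tc"], "ttl": entry["ttl"] - 1}

arriving = {"label": 78, "tc": 5, "ttl": 64}
print(penultimate_forward(arriving, IMPLICIT_NULL))     # None
print(penultimate_forward(arriving, EXPLICIT_NULL_V4))  # {'label': 0, 'tc': 5, 'ttl': 63}
```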
Label Stacking
One of MPLS's most powerful features is label stacking — the ability to push multiple labels onto a packet, creating a stack of labels processed from top (outermost) to bottom (innermost). The Bottom of Stack (S) bit in each label entry indicates whether there are more labels below it.
Label stacking enables hierarchical forwarding. A common use case is MPLS VPN, where two labels are used:
- The outer (transport) label carries the packet across the MPLS core from ingress PE to egress PE. Transit routers only look at this label.
- The inner (VPN/service) label identifies the specific VPN or service instance. Only the egress PE examines this label to determine which customer VPN the packet belongs to.
More complex scenarios can use three or more labels. For example, MPLS Fast Reroute adds a third label to redirect traffic around a failure, and inter-AS VPNs may require additional labels at AS boundaries. In practice, stacks of two to three labels are common; deeper stacks are possible but rare in traditional MPLS (though Segment Routing can use much deeper stacks).
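A short Python sketch shows how the S bit delimits a stack on the wire: every entry except the last has S set to 0, and a parser reads entries until it finds S set to 1. The label values and helper names are illustrative.

```python
import struct

def build_stack(labels, tc=0, ttl=64):
    """Encode labels top-first; only the last entry gets S=1."""
    out = b""
    for i, label in enumerate(labels):
        s = 1 if i == len(labels) - 1 else 0
        out += struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)
    return out

def parse_stack(data):
    """Read entries until the bottom-of-stack bit; return (labels, payload).

    Raises struct.error if the data ends before an entry with S=1.
    """
    labels, offset = [], 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        labels.append(word >> 12)
        offset += 4
        if (word >> 8) & 1:
            return labels, data[offset:]

# Transport label on top, VPN label at the bottom, then the payload.
packet = build_stack([16001, 24005]) + b"payload"
print(parse_stack(packet))  # ([16001, 24005], b'payload')
```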
MPLS VPNs
MPLS VPN services are the primary revenue-generating application of MPLS for service providers. They allow a provider to offer isolated, private network connectivity over a shared MPLS infrastructure. Two main types exist: Layer 3 VPNs and Layer 2 VPNs.
Layer 3 VPN (L3VPN / BGP-MPLS VPN)
L3VPN, defined in RFC 4364 (originally RFC 2547), is the most widely deployed MPLS service. It provides IP routing between customer sites over the provider's MPLS backbone. Each customer gets a private routing table, and different customers can even use overlapping IP address space (e.g., multiple customers all using 10.0.0.0/8 internally).
The key components of L3VPN:
- VRF (Virtual Routing and Forwarding) — each customer is assigned a separate VRF on each PE (Provider Edge) router. The VRF maintains an isolated routing table and forwarding table. Traffic arriving from the customer is placed into the correct VRF based on the interface it arrives on.
- Route Distinguisher (RD) — a 64-bit value prepended to customer IPv4 routes to make them globally unique within BGP. A customer route of 10.0.0.0/8 with RD 65000:1 becomes a VPNv4 route of 65000:1:10.0.0.0/8, distinguishable from another customer's 10.0.0.0/8 with RD 65000:2.
- Route Target (RT) — a BGP extended community that controls which routes are imported into which VRFs. A route exported from VRF A with RT 65000:100 will be imported into any VRF configured to import that RT, enabling hub-and-spoke, full-mesh, and complex topologies.
- MP-BGP — Multi-Protocol BGP carries VPNv4/VPNv6 routes between PE routers. PE routers establish MP-iBGP sessions (often via route reflectors) to exchange VPN routing information, including the MPLS label that should be used for the inner (VPN) label.
The forwarding path uses two labels: the outer label, distributed by LDP or RSVP-TE, carries the packet across the core to the egress PE; the inner label, distributed by MP-BGP, identifies the VRF on the egress PE. Transit P (Provider) routers never inspect customer packets or hold customer routes; they simply swap the outer label. This separation is what gives MPLS VPNs their scalability: core routers carry no VPN state.
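The roles of the RD and RT can be illustrated with a toy model in Python. It is only a sketch: the class, the label values, and the PE names are hypothetical, and real BGP best-path selection and VRF import policy are far richer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VpnRoute:
    rd: str                   # Route Distinguisher: makes the prefix unique
    prefix: str
    route_targets: frozenset  # Route Targets: control VRF import
    vpn_label: int            # inner label advertised by the egress PE
    egress_pe: str

routes = [
    VpnRoute("65000:1", "10.0.0.0/8", frozenset({"65000:100"}), 24001, "PE2"),
    VpnRoute("65000:2", "10.0.0.0/8", frozenset({"65000:200"}), 24002, "PE3"),
]

# Keyed by (RD, prefix), so overlapping customer prefixes stay distinct.
bgp_table = {(r.rd, r.prefix): r for r in routes}

def import_routes(import_rts, table):
    """Build a VRF table from every route carrying an imported RT."""
    return {
        r.prefix: (r.vpn_label, r.egress_pe)
        for r in table.values()
        if r.route_targets & import_rts
    }

print(len(bgp_table))                             # 2
print(import_routes({"65000:100"}, bgp_table))    # {'10.0.0.0/8': (24001, 'PE2')}
```

Each VRF ends up with only its own customer's 10.0.0.0/8, together with the inner label and egress PE needed to build the two-label stack.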
Layer 2 VPN (L2VPN)
L2VPNs provide Layer 2 (Ethernet frame) connectivity over MPLS, making geographically distant sites appear to be on the same LAN segment. Three main approaches exist:
- VPWS (Virtual Private Wire Service) — a point-to-point pseudowire connecting two customer sites. The provider encapsulates Ethernet frames in MPLS labels and transports them across the core. LDP-based pseudowire signaling is defined in RFC 4447 (since obsoleted by RFC 8077); BGP-based autodiscovery for L2VPNs is defined in RFC 6074.
- VPLS (Virtual Private LAN Service) — a multipoint-to-multipoint service that emulates a LAN switch. Each PE router learns MAC addresses from connected customer sites and forwards Ethernet frames across pseudowires to the appropriate destination PE. VPLS uses either LDP (RFC 4762) or BGP (RFC 4761) for signaling.
- EVPN (Ethernet VPN) — the modern evolution of VPLS, defined in RFC 7432. EVPN uses BGP to distribute MAC and IP address information, providing active-active multihoming, faster convergence, and reduced flooding compared to VPLS. EVPN-MPLS is common in service provider networks, while EVPN-VXLAN is prevalent in data centers.
Traffic Engineering with MPLS-TE
Traffic engineering (TE) is the practice of optimizing how traffic flows through a network to avoid congestion, improve utilization, and meet SLAs. In pure IP routing, traffic follows the IGP shortest path, which can lead to some links being overloaded while parallel paths sit idle. MPLS-TE solves this by allowing operators to steer traffic along explicitly chosen paths.
MPLS-TE relies on three components working together:
- IGP-TE extensions — OSPF-TE (RFC 3630) and IS-IS-TE (RFC 5305) flood link attributes like available bandwidth, link color/affinity, and SRLG (Shared Risk Link Group) membership. This gives every router a complete view of the network topology and resources.
- CSPF (Constrained Shortest Path First) — the head-end router runs a modified Dijkstra algorithm over the TE topology database, computing a path that satisfies constraints like minimum bandwidth, required affinities, and node/link exclusions.
- RSVP-TE signaling — once a path is computed, RSVP-TE signals the LSP hop-by-hop, reserving bandwidth and distributing labels.
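The CSPF step can be approximated as "prune, then run Dijkstra". The Python sketch below assumes a tiny hypothetical TE database with one constraint (minimum available bandwidth) and treats links as bidirectional for brevity; real implementations track bandwidth per direction and handle affinities, SRLGs, and tie-breaking.

```python
import heapq

# Hypothetical TE database: (node, node) -> (IGP metric, available Mb/s).
LINKS = {
    ("A", "B"): (10, 400),
    ("B", "D"): (10, 400),
    ("A", "C"): (15, 1000),
    ("C", "D"): (15, 1000),
}

def cspf(links, src, dst, min_bw):
    """Return (cost, path) for the shortest path meeting min_bw, or None."""
    # Step 1: prune links that cannot satisfy the bandwidth constraint.
    graph = {}
    for (a, b), (metric, bw) in links.items():
        if bw >= min_bw:
            graph.setdefault(a, []).append((b, metric))
            graph.setdefault(b, []).append((a, metric))
    # Step 2: ordinary Dijkstra over what is left.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, metric in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(heap, (cost + metric, nbr, path + [nbr]))
    return None

print(cspf(LINKS, "A", "D", min_bw=100))   # (20, ['A', 'B', 'D'])
print(cspf(LINKS, "A", "D", min_bw=500))   # (30, ['A', 'C', 'D'])
print(cspf(LINKS, "A", "D", min_bw=2000))  # None
```

The second call shows the point of traffic engineering: the 500 Mb/s request is placed on the longer path because the IGP shortest path lacks capacity.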
Fast Reroute (FRR)
MPLS-TE Fast Reroute provides sub-50ms protection switching, far faster than waiting for IGP convergence (which can take seconds). Two approaches exist:
- One-to-one backup (detour) — each protected LSP has a dedicated backup path around each potential failure point.
- Facility backup (bypass tunnel) — a shared backup tunnel protects all LSPs passing through a given link or node. When a failure occurs, affected traffic is pushed into the bypass tunnel by adding another label to the stack (label stacking). This is the more scalable approach and is widely deployed.
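The label operation performed during facility backup can be sketched as follows. This is a simplified model assuming link protection, where the bypass tunnel rejoins the original path at the normal next hop; the label values, interface names, and function name are hypothetical.

```python
def plr_forward(stack, out_label, primary_if, bypass_label, bypass_if, primary_up):
    """Forward at the point of local repair; stack[0] is the top label."""
    # Normal swap: the next hop still expects the label it advertised.
    swapped = [out_label] + stack[1:]
    if primary_up:
        return swapped, primary_if
    # Failure: push the bypass tunnel label on top and use the backup interface.
    return [bypass_label] + swapped, bypass_if

print(plr_forward([42, 24001], 78, "Gi0/1", 900, "Gi0/3", primary_up=True))
# ([78, 24001], 'Gi0/1')
print(plr_forward([42, 24001], 78, "Gi0/1", 900, "Gi0/3", primary_up=False))
# ([900, 78, 24001], 'Gi0/3')
```

Because the repair is a local push of one extra label, a single bypass tunnel can carry every LSP that used the failed link.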
RSVP-TE Auto-Bandwidth
Auto-bandwidth is a feature where the head-end router monitors traffic on a TE tunnel and periodically adjusts the reserved bandwidth to match actual usage. This automates capacity management — the tunnel grows when traffic increases and shrinks when it decreases, triggering make-before-break re-optimization as needed.
MPLS and BGP: How They Interact
MPLS and BGP are deeply intertwined in modern service provider networks. While LDP or RSVP-TE handle label distribution for transport within the core, BGP plays several critical roles:
- L3VPN route distribution — MP-BGP (Multi-Protocol BGP with VPNv4/VPNv6 address families) carries customer VPN routes between PE routers. Each VPN route includes an MPLS label that identifies the VRF on the egress PE.
- Inter-AS MPLS VPN — when VPN services span multiple autonomous systems, BGP distributes both routes and labels across AS boundaries. Three options exist (RFC 4364 Section 10): Option A (back-to-back VRF), Option B (labeled VPNv4 exchange at ASBR), and Option C (multihop eBGP for VPNv4 with separate label path).
- BGP Labeled Unicast (BGP-LU) — BGP can distribute MPLS labels for regular unicast prefixes (RFC 8277). This is used for seamless MPLS architectures where BGP labels provide end-to-end LSPs across IGP boundaries within a single AS or across ASes.
- EVPN — BGP carries MAC and IP reachability information for Ethernet VPN services, with MPLS as the data-plane encapsulation.
When you look up an autonomous system in a BGP looking glass, the routes you see are the external (internet) BGP routes. The MPLS labels used for internal transport and VPN services are not visible in the public BGP table — they are internal to each provider's MPLS domain. However, the AS paths you observe in the looking glass often traverse networks that use MPLS internally for their backbone transport.
MPLS vs. Segment Routing
Segment Routing (SR) is the modern successor to the traditional MPLS control plane: it eliminates the need for the LDP and RSVP-TE signaling protocols. Instead of establishing stateful LSPs through the network, Segment Routing encodes the forwarding path as an ordered list of segments (instructions) in the packet header itself. There are two data planes for Segment Routing:
- SR-MPLS — uses the existing MPLS data plane. Segments are encoded as MPLS labels in a label stack. This means SR-MPLS can be deployed on existing MPLS-capable hardware with no forwarding-plane changes.
- SRv6 — encodes segments as IPv6 addresses in an IPv6 Segment Routing Header (SRH). This leverages the IPv6 extension header mechanism and does not require MPLS at all.
Key differences from traditional MPLS:
| Aspect | Traditional MPLS | Segment Routing |
| Label distribution | LDP, RSVP-TE (stateful) | IGP extensions (stateless) |
| State in network | Per-LSP state at every hop | No per-flow state; path in packet header |
| Traffic engineering | RSVP-TE tunnels (thousands of states) | SR Policy (centralized controller or head-end) |
| Fast Reroute | FRR bypass/detour tunnels | Topology-Independent LFA (TI-LFA) |
| Scalability | Limited by per-LSP RSVP-TE state in the core | Per-path state held only at the head-end |
| Signaling complexity | LDP + RSVP-TE + BGP | IGP + BGP (fewer protocols) |
Segment Routing defines two types of segments:
- Prefix SID — identifies a destination prefix. Globally unique within the SR domain. Traffic is forwarded along the shortest IGP path to that prefix (like LDP, but without LDP).
- Adjacency SID — identifies a specific link between two routers. Locally significant. Used to steer traffic over a specific link, enabling traffic engineering without RSVP-TE.
By combining prefix and adjacency SIDs in the label stack, the head-end router can express any desired path through the network. For example, a label stack of [P3-prefix-SID, P3-to-P4-adj-SID, PE-B-prefix-SID] would steer traffic along the shortest path to P3, then over the specific link from P3 to P4, and finally along the shortest path from P4 to PE-B. This achieves the same result as an RSVP-TE tunnel but without any per-tunnel state in the network core.
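For SR-MPLS, a segment list like the one above ultimately becomes a list of label values. The Python sketch below shows one plausible mapping, with several assumptions not stated in the text: prefix SIDs are configured as indexes into a Segment Routing Global Block starting at 16000, the adjacency segment is the label P3 allocated locally for its link to P4, and all index and label values are made up.

```python
# Assumed start of the Segment Routing Global Block (configurable per network).
SRGB_BASE = 16000

# Hypothetical SID assignments.
PREFIX_SID_INDEX = {"P3": 3, "P4": 4, "PE-B": 20}
ADJ_SID = {("P3", "P4"): 24034}  # locally significant label on P3

def segment_list_to_labels(segments):
    """Translate (kind, value) segments into an MPLS label stack, top first."""
    labels = []
    for kind, value in segments:
        if kind == "prefix":
            labels.append(SRGB_BASE + PREFIX_SID_INDEX[value])
        elif kind == "adj":
            labels.append(ADJ_SID[value])
        else:
            raise ValueError("unknown segment kind: " + kind)
    return labels

path = [("prefix", "P3"), ("adj", ("P3", "P4")), ("prefix", "PE-B")]
print(segment_list_to_labels(path))  # [16003, 24034, 16020]
```

The head-end pushes this stack once; no router along the way needs to be told about the path in advance.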
Many service providers are migrating from traditional MPLS (LDP + RSVP-TE) to Segment Routing for reduced complexity and better scalability. Google, for example, has publicly described their SR-based WAN architecture. However, MPLS data-plane semantics (labels, label stacking, push/pop/swap) remain unchanged — SR-MPLS is, in a sense, MPLS without the signaling overhead.
MPLS in the Real World
MPLS is ubiquitous in service provider and large enterprise networks. Virtually every major ISP and transit provider uses MPLS in their backbone. When you trace a route through networks like Lumen (AS3356), NTT (AS2914), or Arelion (AS1299), the packets are almost certainly being label-switched through an MPLS core, even though this is invisible from the outside.
Common deployment scenarios include:
- Service provider backbones — MPLS provides the foundation for all services: internet transit, L3VPN, L2VPN, and mobile backhaul.
- Enterprise WAN — many enterprises purchase MPLS VPN services from providers to connect their branch offices with guaranteed SLAs.
- Mobile networks — MPLS transports traffic between cell towers (eNodeB/gNodeB) and the mobile core, often with strict QoS guarantees.
- Data center interconnect — EVPN-MPLS provides Layer 2 stretch and workload mobility between data centers.
The peering relationships you see when looking up transit providers in the looking glass reflect the edges of these MPLS domains. At Internet Exchange Points, traffic typically transitions from one provider's MPLS domain to another's, with labels being popped at the egress PE before the packet crosses the peering fabric.
Explore the Networks That Run MPLS
While MPLS labels are internal to each provider and not visible in the public routing table, you can explore the autonomous systems that rely on MPLS for their backbone transport. Use the BGP looking glass to examine routes, AS paths, and peering relationships for major transit providers: