How OSPF Works: Link-State Routing and Dijkstra's Algorithm
Open Shortest Path First (OSPF) is a link-state interior gateway protocol (IGP) used to distribute routing information within a single autonomous system. Unlike BGP, which routes between autonomous systems using policy-driven path selection, OSPF computes shortest paths through a network by having every router build a complete topological map of the routing domain and then running Dijkstra's Shortest Path First (SPF) algorithm against that map. OSPF is defined in RFC 2328 (OSPFv2 for IPv4) and RFC 5340 (OSPFv3 for IPv6) and is one of the two dominant IGPs in production networks, alongside IS-IS.
OSPF matters to BGP operators because IGP reachability is a prerequisite for iBGP sessions. If the IGP cannot provide a path between two BGP speakers within an AS, their iBGP session will not come up, and external routes learned via eBGP will not propagate internally. OSPF also feeds next-hop resolution for BGP routes — the BGP next-hop attribute must be resolvable via the IGP for a BGP route to be installed in the forwarding table. In MPLS networks, OSPF extensions (RFC 3630) carry traffic engineering information used by RSVP-TE to compute constrained paths.
Link-State Fundamentals
OSPF is a link-state protocol, which means every router in the OSPF domain independently discovers its neighbors, measures the cost of each link, and floods this information to all other routers. Each router ends up with an identical copy of the Link-State Database (LSDB) — a complete graph of the network topology. The router then runs Dijkstra's SPF algorithm on this graph to compute a shortest-path tree rooted at itself, producing a routing table.
This contrasts with distance-vector protocols like RIP, where routers only share their routing tables with immediate neighbors and have no knowledge of the full topology. Link-state protocols converge faster, avoid count-to-infinity problems, and scale better because each router makes forwarding decisions based on a globally consistent view of the network.
OSPF Packet Types
OSPF uses IP protocol number 89 (not TCP or UDP) and defines five packet types. All OSPF packets share a common 24-byte header containing the version number, packet type, packet length, router ID of the sender, area ID, checksum, and authentication data.
- Type 1 — Hello: Discovers and maintains neighbor relationships. Hello packets are sent periodically (every 10 seconds on broadcast/point-to-point links, every 30 seconds on NBMA networks) to the multicast address 224.0.0.5 (AllSPFRouters). They carry the router's Hello/Dead intervals, area ID, subnet mask, router priority (for DR/BDR election), the list of neighbors the router has heard Hellos from, and option flags. Two routers become neighbors only if they agree on Hello interval, Dead interval, area ID, subnet mask, and authentication.
- Type 2 — Database Description (DBD): Used during initial adjacency formation. Routers exchange summaries of their LSDBs so each side can determine which LSAs the other has that it is missing. DBD packets contain a list of LSA headers (not full LSAs), sequence numbers for the exchange, and master/slave negotiation bits.
- Type 3 — Link-State Request (LSR): After comparing DBD summaries, a router sends LSR packets to request specific LSAs it is missing or that have newer sequence numbers than its local copies.
- Type 4 — Link-State Update (LSU): Carries one or more full LSAs, either in response to an LSR or as part of the flooding process when topology changes occur. LSU packets are acknowledged to ensure reliable delivery.
- Type 5 — Link-State Acknowledgment (LSAck): Acknowledges receipt of LSU packets. OSPF uses explicit acknowledgments (not implicit) to guarantee reliable flooding. A router will retransmit an LSA if it does not receive an acknowledgment within a timeout period.
Neighbor Discovery and Adjacency Formation
OSPF routers go through a well-defined state machine when forming adjacencies with their neighbors. Understanding these states is essential for troubleshooting OSPF problems:
- Down: No Hellos have been received from the neighbor. This is the initial state.
- Init: A Hello has been received from the neighbor, but the neighbor has not yet listed this router in its Hello packet's neighbor list. Communication is one-way.
- 2-Way: Bidirectional communication is established — each router sees its own router ID in the neighbor's Hello. On broadcast and NBMA networks, DR/BDR election occurs at this point. Not all neighbor pairs form full adjacencies on multi-access networks; only adjacencies with the DR and BDR proceed past this state.
- ExStart: The two routers negotiate master/slave roles and initial sequence numbers for the DBD exchange. The router with the higher router ID becomes the master.
- Exchange: Routers exchange DBD packets containing LSA headers. Each router compares received headers against its LSDB to identify missing or outdated LSAs.
- Loading: Routers send LSR packets to request missing LSAs and receive them in LSU packets. The LSDB synchronization process is in progress.
- Full: LSDBs are synchronized. The adjacency is fully operational, and the routers can participate in SPF computation together. This is the steady-state for healthy adjacencies.
On point-to-point links, both routers always form a full adjacency. On broadcast multi-access networks (like Ethernet), OSPF elects a DR and BDR to reduce the number of adjacencies from O(n²) to O(n). Each router on the segment forms an adjacency only with the DR and BDR, not with every other router.
Designated Router and Backup Designated Router
On broadcast and NBMA network segments, OSPF elects a Designated Router (DR) and a Backup Designated Router (BDR) to reduce flooding overhead and the number of adjacencies. Without a DR, a segment with N routers would require N(N-1)/2 adjacencies; with a DR, only N-1 adjacencies are needed (each non-DR/BDR router, called a DROther, forms adjacencies only with the DR and BDR).
The DR serves two purposes: it represents the multi-access network as a pseudonode in the LSDB (via a Type 2 Network LSA), and it is responsible for flooding LSAs to all routers on the segment. When a DROther router has a new LSA to flood, it sends it to the multicast address 224.0.0.6 (AllDRouters), and the DR then refloods it to 224.0.0.5 (AllSPFRouters) so that all other routers on the segment receive it.
The BDR exists for redundancy. If the DR fails, the BDR promotes to DR without requiring a new election among all routers (only a new BDR is elected). This minimizes convergence time after a DR failure.
DR Election Rules
DR/BDR election is based on router priority (configurable, 0-255, default 1) and router ID (tiebreaker). The rules are:
- A router with priority 0 is ineligible to become DR or BDR.
- The router with the highest priority becomes the DR. If priorities are equal, the highest router ID wins.
- DR/BDR election is non-preemptive: if a router with a higher priority joins the network after a DR has already been elected, the existing DR retains its role. The new router will only become DR if the current DR fails.
- The router ID is typically the highest loopback IP address configured on the router, or the highest physical interface IP if no loopback exists. It can also be explicitly configured.
OSPF Areas
OSPF uses a hierarchical area structure to improve scalability. Every OSPF domain has a backbone area (Area 0), and all other areas must connect to Area 0 either directly or via virtual links. This two-level hierarchy limits LSA flooding and SPF computation to within each area, reducing CPU and memory consumption on routers.
Area Types
- Backbone Area (Area 0): The core of the OSPF domain. All inter-area traffic transits Area 0. All ABRs must have at least one interface in Area 0. If a non-backbone area cannot physically connect to Area 0, a virtual link through a transit area can bridge the gap (though virtual links are generally considered a design band-aid, not a permanent solution).
- Regular Area: Receives all LSA types — intra-area (Type 1, 2), inter-area summaries (Type 3, 4), and external routes (Type 5). Routers in a regular area maintain the complete LSDB and have full visibility into routes from all areas and external sources.
- Stub Area: Does not accept Type 5 (external) LSAs. The ABR injects a default route (0.0.0.0/0) into the stub area so that internal routers can still reach external destinations via the ABR. This significantly reduces LSDB size and SPF computation time for routers within the stub area. An ASBR cannot reside in a stub area.
- Totally Stubby Area: A Cisco extension that blocks both Type 5 external LSAs and Type 3 inter-area summary LSAs. The ABR injects only a default route. Routers inside see only intra-area routes and the default, minimizing LSDB size to the absolute minimum.
- Not-So-Stubby Area (NSSA): Defined in RFC 3101. Like a stub area (no Type 5 LSAs), but allows an ASBR within the area to inject external routes using Type 7 NSSA External LSAs. The ABR converts Type 7 LSAs to Type 5 at the area boundary before flooding them into the backbone. This is useful when a branch office needs to redistribute routes from a local routing protocol (e.g., a static route to a local network) but does not need to receive the full external routing table from the rest of the OSPF domain.
Area Border Routers (ABRs)
An ABR is a router with interfaces in multiple OSPF areas, including Area 0. ABRs maintain separate LSDBs for each area they belong to and run independent SPF calculations per area. They generate Type 3 Summary LSAs to advertise inter-area routes: routes learned within one area are summarized and advertised into adjacent areas. This is where OSPF route summarization occurs — an ABR can aggregate multiple specific prefixes into a single summary before advertising it into another area, reducing LSDB size and improving stability.
LSA Types
OSPF uses different LSA types to describe different aspects of the network topology. Understanding LSA types is critical for troubleshooting and design:
- Type 1 — Router LSA: Originated by every OSPF router. Describes the router's interfaces within a specific area, their costs, and the neighbors reachable through each interface. Type 1 LSAs are flooded only within the originating area. The collection of all Type 1 LSAs in an area fully describes the area's internal topology.
- Type 2 — Network LSA: Originated by the DR on broadcast and NBMA segments. Lists all routers attached to the multi-access network. Without this LSA type, the LSDB would have no way to represent a multi-access network as a single entity — each router-to-router connection would appear as a separate point-to-point link, distorting the topology.
- Type 3 — Summary LSA (Network): Originated by ABRs. Advertises inter-area routes (prefixes from one area into another). Despite the name "summary," these are not necessarily aggregated routes — by default, each prefix gets its own Type 3 LSA. Actual summarization requires explicit configuration on the ABR.
- Type 4 — Summary LSA (ASBR): Originated by ABRs. Advertises the reachability of an ASBR in another area. Routers in remote areas need this to know how to reach the ASBR that originated external routes (Type 5 LSAs). The Type 4 LSA provides the metric to the ASBR, allowing correct external route computation.
- Type 5 — AS External LSA: Originated by ASBRs. Advertises routes redistributed from external sources (other routing protocols, static routes, connected routes). Type 5 LSAs are flooded throughout the entire OSPF domain (except stub and NSSA areas). They come in two subtypes: E1 (metric includes internal OSPF cost to the ASBR) and E2 (metric is the external metric only, ignoring internal cost; E2 is the default).
- Type 7 — NSSA External LSA: Similar to Type 5 but used within NSSA areas. Originated by ASBRs within an NSSA. The ABR converts Type 7 to Type 5 at the NSSA boundary. This conversion is known as the 7-to-5 translation, and the ABR doing the translation sets the forwarding address in the Type 5 LSA to the ASBR's address so that traffic takes the optimal path.
Dijkstra's SPF Algorithm in OSPF
At the heart of OSPF is Dijkstra's SPF algorithm, which computes the shortest-path tree from the running router to every other router and network in the area. The algorithm works as follows:
- Initialize the SPF tree with only the root (this router). Place it on the candidate list (also called the OPEN set).
- Move the root to the shortest-path tree (CLOSED set). Examine all links described in the root's Type 1 Router LSA. For each neighbor, calculate the cumulative cost from the root. Add each neighbor to the candidate list with its computed cost.
- From the candidate list, select the vertex with the lowest cumulative cost. Move it to the shortest-path tree.
- For the newly added vertex, examine its LSA to discover its neighbors. For each neighbor not yet in the shortest-path tree, compute the cumulative cost via this vertex. If the neighbor is not in the candidate list, add it. If it is already in the candidate list with a higher cost, update its cost and parent pointer.
- Repeat steps 3-4 until the candidate list is empty.
The resulting tree gives the shortest path from this router to every other router and network prefix in the area. The next hop for each destination is determined by tracing back through the tree to find the first hop along each shortest path.
OSPF uses interface cost as the metric, typically calculated as the reference bandwidth divided by the interface bandwidth (the default reference bandwidth is 100 Mbps, so a 100 Mbps link has cost 1, a 10 Mbps link has cost 10, and a 1 Gbps link also has cost 1 unless the reference bandwidth is increased). Administrators should adjust the reference bandwidth to differentiate between modern high-speed links (e.g., setting it to 100 Gbps so that 10G links have cost 10 and 100G links have cost 1).
OSPF Route Types and Preference
OSPF routes have a strict preference order. When multiple routes to the same destination exist, OSPF selects based on route type (not just metric):
- Intra-area routes (O): Routes within the same area. Always preferred over inter-area routes, regardless of metric.
- Inter-area routes (O IA): Routes from other areas, learned via Type 3 Summary LSAs. Preferred over external routes.
- External Type 1 routes (O E1): External routes where the metric includes the internal OSPF cost to the ASBR plus the external metric.
- External Type 2 routes (O E2): External routes where only the external metric is considered (default type). If two E2 routes have equal external metric, the route with the lower internal cost to the ASBR is preferred.
- NSSA Type 1 routes (O N1): Like E1 but within an NSSA.
- NSSA Type 2 routes (O N2): Like E2 but within an NSSA.
This hierarchy means an intra-area route with a metric of 1000 is preferred over an inter-area route with a metric of 1. This design prevents suboptimal routing through external paths and ensures area boundaries are respected in path selection.
OSPF Authentication
OSPF supports three authentication modes to prevent unauthorized routers from injecting routes:
- Null (Type 0): No authentication. The default, and obviously insecure.
- Simple Password (Type 1): A plaintext password included in OSPF packets. Provides minimal security since the password can be captured with a packet sniffer.
- MD5 Cryptographic (Type 2): An HMAC-MD5 hash of the OSPF packet and a shared key. The key itself is never transmitted. This is the most widely used authentication method, though MD5 is cryptographically weak by modern standards.
RFC 5709 introduced SHA-based HMAC authentication for OSPF, providing stronger cryptographic protection. In practice, many networks still use MD5 because the threat model for IGP authentication is typically limited to on-link attackers (someone with physical or Layer 2 access to the network segment), and MD5 is sufficient for this use case.
OSPF Equal-Cost Multi-Path (ECMP)
When Dijkstra's algorithm discovers multiple paths to the same destination with equal cost, OSPF can install all of them in the routing table for load balancing. This is called Equal-Cost Multi-Path (ECMP). Most router implementations support up to 4, 8, 16, or even 64 equal-cost next hops per destination.
ECMP traffic distribution is typically per-flow (based on a hash of source/destination IP, protocol, and port numbers) rather than per-packet, to avoid out-of-order delivery. ECMP is one of OSPF's significant advantages over simple static routing, as it automatically discovers and utilizes parallel paths.
OSPFv2 vs. OSPFv3
OSPFv2 (RFC 2328) was designed for IPv4 and carries IPv4 prefixes. OSPFv3 (RFC 5340) was originally designed for IPv6 but has been extended (RFC 5838) to support multiple address families, including IPv4. Key differences:
- Addressing: OSPFv2 identifies neighbors by their interface IP address. OSPFv3 identifies neighbors by their router ID (a 32-bit number) and runs per-link rather than per-subnet, so multiple instances can run on the same link.
- LSA format: OSPFv3 removes address information from Router and Network LSAs (Type 1 and 2). Instead, prefixes are advertised in new LSA types: Intra-Area Prefix LSA (Type 9) and Link LSA (Type 8). This decoupling of topology from addressing makes OSPFv3 more flexible.
- Flooding scope: OSPFv3 introduces a flooding scope field in the LSA header, supporting link-local, area, and AS-wide flooding scopes. OSPFv2 determines flooding scope implicitly from the LSA type.
- Authentication: OSPFv3 does not include built-in authentication. Instead, it relies on IPsec (AH or ESP) for authentication and confidentiality, as defined in RFC 4552. This simplifies the protocol but requires IPsec infrastructure.
- Link-local addresses: OSPFv3 uses IPv6 link-local addresses for neighbor discovery and Hello packets, not global or unique-local addresses. This is a natural fit for IPv6's link-local addressing model.
OSPF in Service Provider Networks
In large ISP networks, OSPF serves as the IGP that provides reachability between iBGP speakers and MPLS LSRs. The typical design pattern is:
- OSPF (or IS-IS) runs in the backbone to advertise loopback addresses of all PE and P routers.
- iBGP sessions are established between PE router loopbacks, with the IGP providing next-hop resolution.
- LDP or Segment Routing labels are distributed based on IGP reachability.
- Only infrastructure prefixes (loopbacks and point-to-point links) are advertised in the IGP — customer routes are carried entirely in BGP.
Keeping the IGP lean (tens or hundreds of prefixes rather than thousands) is a best practice that ensures fast convergence. When the IGP carries too many routes, SPF computation time increases, and convergence after a link failure slows down.
OSPF Traffic Engineering Extensions
RFC 3630 defines OSPF-TE (Traffic Engineering) extensions that allow OSPF to carry link attributes needed for MPLS RSVP-TE path computation. These attributes are carried in Opaque LSAs (Type 10) and include:
- Maximum bandwidth of a link
- Maximum reservable bandwidth
- Unreserved bandwidth at each of 8 priority levels
- Administrative group (link coloring for affinity-based path constraints)
- TE metric (can differ from the IGP metric)
- SRLG (Shared Risk Link Group) membership
This information allows the head-end router of an RSVP-TE tunnel to run a Constrained Shortest Path First (CSPF) computation that considers not just link metrics but also available bandwidth and administrative constraints. OSPF-TE is being gradually superseded by Segment Routing extensions, which provide similar traffic engineering capabilities without the per-tunnel state overhead of RSVP-TE.
OSPF Convergence and Tuning
OSPF convergence — the time from a topology change to the routing table being updated — involves several components:
- Failure detection: How quickly a link or neighbor failure is detected. Options include waiting for the Dead interval (default 40 seconds, far too slow for production), using Bidirectional Forwarding Detection (BFD, sub-second detection), or relying on physical layer loss-of-signal indications.
- LSA generation: After detecting a failure, the affected router generates a new LSA. Most implementations apply an initial delay (e.g., 50ms) to batch multiple simultaneous changes.
- LSA flooding: The LSA must be reliably flooded to all routers in the area. In a well-connected area, flooding completes in a few milliseconds per hop.
- SPF computation: Each router receiving the LSA must re-run SPF. Modern routers apply SPF throttling (initial delay, hold time, maximum wait time) to prevent CPU spikes from multiple rapid topology changes. A typical configuration is 50ms initial, 200ms hold, 5s maximum.
- RIB/FIB update: The computed routes must be installed in the routing table and programmed into the forwarding hardware.
With BFD for fast failure detection, aggressive SPF timers, and modern hardware, OSPF convergence can be achieved in under 100 milliseconds for typical topologies. However, in very large areas with thousands of prefixes, the SPF computation itself can take significant time, which is one reason for careful area design and keeping the IGP topology small.
OSPF Graceful Restart and Non-Stop Routing
RFC 3623 defines OSPF Graceful Restart (GR), which allows a router to restart its OSPF process (e.g., for a software upgrade) without causing its neighbors to treat it as failed. During a graceful restart, neighboring routers continue to forward traffic to the restarting router using existing routes, and the restarting router has a grace period to re-establish adjacencies and synchronize its LSDB without triggering network-wide SPF recalculations.
Non-Stop Routing (NSR) takes a different approach: the standby supervisor maintains a synchronized copy of the OSPF state, so when the active supervisor fails, the standby takes over without any protocol disruption visible to neighbors. NSR is platform-specific and does not require protocol-level support from neighbors, unlike GR which requires helper mode on neighboring routers.
Common OSPF Design Mistakes
Several design errors frequently cause OSPF operational problems:
- Area 0 fragmentation: If Area 0 becomes partitioned (e.g., due to a link failure), inter-area routing breaks because traffic between non-backbone areas must transit Area 0. Virtual links can temporarily bridge partitions but are not a scalable solution.
- Excessive area size: Placing thousands of routers and prefixes in a single area negates OSPF's hierarchical design benefits. SPF computation becomes expensive, and every link state change triggers a full SPF run across all routers in the area.
- Mutual redistribution: Redistributing routes between OSPF and another protocol (e.g., EIGRP) in both directions without careful filtering creates routing loops and instability. Using route tags and distribute lists to prevent feedback is essential.
- Mismatched parameters: Neighbors will not form adjacencies if Hello/Dead timers, area IDs, authentication credentials, or network types are mismatched. This is the most common cause of OSPF adjacency failures.
- Forgetting the reference bandwidth: With the default reference bandwidth of 100 Mbps, all links from 100 Mbps to 100 Gbps have the same OSPF cost of 1, making OSPF unable to prefer faster paths. Adjusting the reference bandwidth is essential in modern networks.
OSPF and BGP Interaction
OSPF and BGP serve complementary roles. OSPF handles internal routing within an autonomous system, while BGP handles routing between autonomous systems. Their interaction follows a clear pattern:
- OSPF advertises the loopback addresses used as BGP router IDs and iBGP session endpoints.
- BGP next-hop resolution uses the OSPF-computed shortest path to the BGP next-hop address.
- External routes learned via eBGP are not redistributed into OSPF in well-designed networks. The full internet routing table (over 1 million prefixes) would overwhelm the OSPF LSDB and SPF computation. Instead, BGP carries external routes, and OSPF only carries the infrastructure prefixes needed for BGP and MPLS to function.
- When a BGP next-hop becomes unreachable due to an IGP topology change, the BGP routes through that next-hop are withdrawn or reconverged. This is why IGP convergence speed directly affects BGP convergence.
Explore OSPF-Carried BGP Routes
OSPF provides the foundation that makes BGP work within an AS. To see real BGP routing data — the AS paths, prefixes, and origin ASes that BGP carries on top of the IGP underlay — try the god.ad BGP Looking Glass. Look up any IP address or ASN to see live BGP routes and understand how inter-domain routing builds on top of the intra-domain routing that OSPF provides.