How Spanning Tree Protocol Works: Preventing Layer 2 Loops

Spanning Tree Protocol (STP) is a Layer 2 network protocol that prevents forwarding loops in Ethernet networks with redundant paths. When switches are connected in a mesh or ring topology for fault tolerance, the redundant links create the possibility of broadcast storms, MAC address table instability, and duplicate frame delivery. STP solves this by logically blocking redundant ports so that the active topology forms a loop-free spanning tree, while keeping blocked links available for failover if an active link fails. Without STP, a single Ethernet broadcast frame entering a loop would circulate indefinitely, replicating at each switch, and consume all available bandwidth within seconds.

The original Spanning Tree Protocol was defined in IEEE 802.1D, published in 1990 and based on an algorithm invented by Radia Perlman in 1985. Her solution was elegant: elect a single root bridge, compute shortest paths from every switch to the root, and disable all other links. The protocol has been revised multiple times since then — Rapid Spanning Tree Protocol (RSTP, 802.1w) drastically improved convergence time, and Multiple Spanning Tree Protocol (MSTP, 802.1s) added VLAN-aware spanning trees — but the fundamental principle remains the same. STP is one of the most critical yet least visible protocols in enterprise and data center networking.

Why Layer 2 Loops Are Catastrophic

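Unlike Layer 3, where IP packets have a TTL (Time to Live) field that is decremented at each hop and causes the packet to be discarded when it reaches zero, Ethernet frames have no TTL equivalent. Once a frame enters a Layer 2 loop, nothing stops it. It circulates forever, and the situation compounds exponentially: broadcast frames are replicated at every switch until the resulting storm consumes all available bandwidth, MAC address tables become unstable as the same source address keeps arriving on different ports, and hosts receive duplicate copies of frames.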

A single miscabled patch cord creating a physical loop in an access closet can take down an entire Layer 2 domain — every VLAN that spans the loop is affected. Network engineers sometimes call this "the network melting." STP exists to ensure this cannot happen, even when redundant links are intentionally provisioned for resilience.

STP Fundamentals: The 802.1D Algorithm

The original STP algorithm works by electing a root bridge, then computing a loop-free tree topology rooted at that bridge. The algorithm proceeds in three phases:

  1. Root bridge election — All switches exchange Bridge Protocol Data Units (BPDUs). The switch with the lowest Bridge ID becomes the root bridge.
  2. Root port selection — Every non-root switch selects the port with the lowest-cost path to the root bridge as its root port. This port forwards traffic toward the root.
  3. Designated port selection — On each network segment (link between switches), the switch that offers the lowest-cost path to the root bridge wins the role of designated bridge for that segment. Its port on that segment becomes the designated port. All other ports on that segment are placed in a blocking state.

The result is a tree: every switch has exactly one path to the root bridge, broadcast traffic flows down the tree and back up without looping, and redundant links sit in blocking state, ready to activate if the primary topology fails.

Bridge ID and Bridge Priority

The Bridge ID is the value used to elect the root bridge. It is an 8-byte value composed of two parts: a 2-byte bridge priority (default 32768) followed by the switch's 6-byte MAC address.

The switch with the numerically lowest Bridge ID wins the root election. Since priority is compared first, a network administrator can deterministically control which switch becomes root by setting its priority to a low value (e.g., 4096 or 8192). If priorities are equal, the switch with the lowest MAC address wins — but relying on MAC addresses for root election is poor practice because it removes administrative control.
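The comparison described above can be sketched in a few lines of Python. This is only an illustration: the `BridgeID` type, the `elect_root` helper, and the MAC addresses are hypothetical names and values chosen for the example, not part of any switch API.

```python
from typing import NamedTuple


class BridgeID(NamedTuple):
    """8-byte Bridge ID: 2-byte priority followed by a 6-byte MAC address."""
    priority: int
    mac: str  # colon-separated hex, e.g. "00:1a:2b:00:00:01"

    def to_bytes(self) -> bytes:
        mac_bytes = bytes.fromhex(self.mac.replace(":", ""))
        return self.priority.to_bytes(2, "big") + mac_bytes


def elect_root(bridges: list[BridgeID]) -> BridgeID:
    # Comparing the packed 8-byte values compares priority first, then MAC.
    return min(bridges, key=BridgeID.to_bytes)


switches = [
    BridgeID(32768, "00:1a:2b:00:00:03"),
    BridgeID(4096, "00:1a:2b:00:00:09"),   # administratively lowered priority
    BridgeID(32768, "00:1a:2b:00:00:01"),
]
root = elect_root(switches)
print(root)
```

The switch with priority 4096 wins even though its MAC address is the highest of the three, which is why setting the priority explicitly gives the administrator control over the outcome.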

BPDU Format

All STP communication occurs through Bridge Protocol Data Units (BPDUs). Switches send BPDUs as Ethernet frames to the well-known multicast destination address 01:80:C2:00:00:00. This address is reserved by IEEE and is never forwarded by compliant switches, meaning BPDUs are processed locally at each hop and never traverse a link beyond the immediate neighbor.

Configuration BPDU (802.1D), 35 bytes

Field              Size      Notes
Protocol ID        2 bytes   0x0000
Version            1 byte    0x00
BPDU Type          1 byte
Flags              1 byte
Root Bridge ID     8 bytes   priority 2B + MAC 6B
Root Path Cost     4 bytes
Sender Bridge ID   8 bytes   priority 2B + MAC 6B
Port ID            2 bytes
Message Age        2 bytes
Max Age            2 bytes   default 20s
Hello Time         2 bytes   default 2s
Forward Delay      2 bytes   default 15s

BPDU Type values:
  0x00 = Configuration BPDU (sent by designated ports)
  0x80 = Topology Change Notification (TCN) BPDU
  0x02 = RSTP BPDU (802.1w, version 2)

Flags byte:
  Bit 0 = Topology Change (TC)
  Bit 7 = Topology Change Acknowledgment (TCA)
  Bits 1-6 = Reserved (used by RSTP)
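To make the field layout concrete, here is a minimal Python sketch that unpacks the 35-byte Configuration BPDU body with the standard `struct` module. The function name, the dictionary keys, and the sample values are illustrative assumptions. One detail goes beyond the layout above: the four timer fields are encoded in units of 1/256 second, so the sketch divides them by 256.

```python
import struct

# Big-endian layout of the 35-byte Configuration BPDU body.
# "8s" is an 8-byte Bridge ID (2-byte priority + 6-byte MAC).
BPDU_FORMAT = "!HBBB8sI8sHHHHH"


def parse_config_bpdu(data: bytes) -> dict:
    (proto, version, bpdu_type, flags, root_id, root_cost,
     sender_id, port_id, msg_age, max_age, hello, fwd_delay) = struct.unpack(
        BPDU_FORMAT, data[:35])
    return {
        "protocol_id": proto,
        "version": version,
        "type": bpdu_type,
        "topology_change": bool(flags & 0x01),      # bit 0
        "topology_change_ack": bool(flags & 0x80),  # bit 7
        "root_priority": int.from_bytes(root_id[:2], "big"),
        "root_mac": root_id[2:].hex(":"),
        "root_path_cost": root_cost,
        "sender_priority": int.from_bytes(sender_id[:2], "big"),
        "sender_mac": sender_id[2:].hex(":"),
        "port_id": port_id,
        # Timer fields are carried in units of 1/256 second.
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello_time": hello / 256,
        "forward_delay": fwd_delay / 256,
    }


# Hypothetical sample: root priority 4096, cost 4, TC flag set, default timers.
sample = struct.pack(
    BPDU_FORMAT, 0x0000, 0x00, 0x00, 0x01,
    (4096).to_bytes(2, "big") + bytes.fromhex("001a2b000001"), 4,
    (32768).to_bytes(2, "big") + bytes.fromhex("001a2b000002"),
    0x8001, 1 * 256, 20 * 256, 2 * 256, 15 * 256)
bpdu = parse_config_bpdu(sample)
print(bpdu["root_priority"], bpdu["root_mac"], bpdu["max_age"])
```

The sketch handles only the BPDU body; on the wire it is preceded by the Ethernet and LLC headers, which are not modelled here.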

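There are two types of BPDUs in classic STP: Configuration BPDUs (type 0x00), which designated ports send to advertise the root bridge, path cost, and timer values, and Topology Change Notification (TCN) BPDUs (type 0x80), which a switch sends toward the root bridge when it detects a topology change.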

Path Cost Calculation

Each switch port has an associated STP path cost that reflects the bandwidth of the link. The root path cost for a port is the cumulative cost to reach the root bridge through that port. When a switch receives a BPDU on a port, it adds that port's cost to the Root Path Cost advertised in the BPDU to determine the total cost to reach the root through that path.

The IEEE has defined two cost scales. The original 802.1D short costs (16-bit values) were designed when 10 Mbps was common. The revised long costs (32-bit values) were introduced to accommodate 10 Gbps and faster links:

Link Speed   Short Cost (802.1D-1998)   Long Cost (802.1D-2004)
10 Mbps      100                        2,000,000
100 Mbps     19                         200,000
1 Gbps       4                          20,000
10 Gbps      2                          2,000
100 Gbps     1                          200
400 Gbps     1                          50

The problem with the short cost scale is obvious: both 100 Gbps and 400 Gbps links have a cost of 1, making it impossible for STP to prefer the faster link. The long cost scale provides much better granularity and should be used in modern deployments.
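The long-cost values in the table are consistent with dividing 20,000,000 by the link speed in Mbps. The sketch below uses that relationship, and shows the addition a switch performs when it receives a BPDU. The function names and the `SHORT_COST` lookup are illustrative, and speeds are assumed to be given in Mbps.

```python
# Short costs from the table above, keyed by link speed in Mbps.
SHORT_COST = {10: 100, 100: 19, 1_000: 4, 10_000: 2, 100_000: 1, 400_000: 1}


def long_cost(speed_mbps: int) -> int:
    # Reproduces the long-cost column: 20,000,000 / speed in Mbps.
    return 20_000_000 // speed_mbps


def root_path_cost(advertised_cost: int, port_speed_mbps: int) -> int:
    # Root Path Cost from the received BPDU plus the receiving port's cost.
    return advertised_cost + long_cost(port_speed_mbps)


for speed in SHORT_COST:
    print(speed, SHORT_COST[speed], long_cost(speed))

# A switch with a 1 Gbps uplink hears a neighbor advertise cost 2,000
# (the neighbor reaches the root over one 10 Gbps link).
print(root_path_cost(2_000, 1_000))
```

With long costs, a 400 Gbps path (50) is distinguishable from a 100 Gbps path (200), whereas the short scale assigns both a cost of 1.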

Port States in 802.1D STP

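Classic STP defines five port states: Disabled, Blocking, Listening, Learning, and Forwarding. A port that is becoming active moves from Blocking through Listening and Learning, spending one Forward Delay (default 15 seconds) in each, before it reaches Forwarding.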

The critical implication of these transitional states is that when a topology change occurs, there is a minimum 30-second delay (listening + learning) before a newly active port begins forwarding traffic. Combined with the Max Age timer (default 20 seconds, the time a switch waits before declaring a BPDU stale and triggering reconvergence), the worst-case convergence time for classic 802.1D STP is approximately 50 seconds — 20 seconds of Max Age plus 30 seconds of forward delay. For many modern applications, this is unacceptably slow.

Root Bridge Election in Detail

When switches first boot, each switch assumes it is the root bridge and begins sending Configuration BPDUs advertising itself as root (with a root path cost of 0). When a switch receives a BPDU advertising a lower Bridge ID than its own, it recognizes that it is not the root and stops generating its own root-claim BPDUs. Instead, it begins relaying the superior BPDU (with updated path cost and sender information) on its designated ports.

The election converges when all switches agree on the same root bridge. The tiebreaking hierarchy for determining the superior BPDU is:

  1. Lowest Root Bridge ID — The BPDU advertising the switch with the lowest Bridge ID (priority + MAC) as root wins.
  2. Lowest Root Path Cost — If two BPDUs advertise the same root, the one with the lower cumulative path cost wins.
  3. Lowest Sender Bridge ID — If the root and cost are equal, the BPDU from the switch with the lower Bridge ID wins.
  4. Lowest Sender Port ID — If even the sender is the same (multiple links between two switches), the BPDU from the lower port ID wins.

This deterministic hierarchy ensures that the topology converges to a single, consistent spanning tree. Network administrators should always explicitly configure root bridge priority rather than leaving it to the default, because the default priority of 32768 is the same on all switches, causing the root election to be decided by MAC address — which is effectively random and may place the root on a low-performance access switch rather than a high-capacity core switch.
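Because each tiebreaker is "lowest wins" and they are applied in a fixed order, the hierarchy maps naturally onto tuple comparison. The following sketch assumes Bridge IDs are represented as `(priority, MAC)` tuples with MAC addresses written as lowercase, colon-separated hex so that string order matches numeric order. The `PriorityVector` type and `is_superior` helper are illustrative names.

```python
from typing import NamedTuple


class PriorityVector(NamedTuple):
    root_id: tuple[int, str]     # (priority, MAC) of the advertised root
    root_path_cost: int
    sender_id: tuple[int, str]   # (priority, MAC) of the sending bridge
    sender_port_id: int


def is_superior(a: PriorityVector, b: PriorityVector) -> bool:
    # Tuples compare field by field in declaration order; lower wins.
    return a < b


ROOT = (4096, "00:00:00:00:00:0a")
via_b = PriorityVector(ROOT, 4, (32768, "00:00:00:00:00:0b"), 0x8001)
via_c = PriorityVector(ROOT, 4, (32768, "00:00:00:00:00:0c"), 0x8001)
rogue = PriorityVector((32768, "00:00:00:00:00:01"), 0, (32768, "00:00:00:00:00:01"), 0x8001)

print(is_superior(via_b, via_c))  # same root and cost, lower sender Bridge ID
print(is_superior(via_b, rogue))  # lower root Bridge ID beats a lower path cost
```

The second comparison illustrates that a lower path cost never outranks a lower Root Bridge ID, since the root field is compared first.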

STP Port Roles

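Once the root bridge is elected and path costs are computed, every port on every switch is assigned one of the following roles: root port (the port with the lowest-cost path to the root bridge), designated port (the forwarding port for its segment), or non-designated port (neither of the above, and therefore blocked).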

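Figure: STP Topology, Port Roles and States. Switch A is the root bridge (priority 4096); Switch B and Switch C both have priority 32768. Each of the three links has cost 4. On the A-B and A-C links, Switch A's port is a designated port (forwarding) and the far end is a root port (forwarding). On the B-C link, one end is a designated port (forwarding) and the other is blocked. Legend: DP = Designated Port, RP = Root Port, BK = Blocked.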

In this example, Switch A is the root bridge (lowest priority). Both Switch B and Switch C select their uplink to Switch A as their root port. On the B-C segment, Switch B is the designated bridge because it has a lower Bridge ID (assuming a lower MAC address), so its port toward C is a designated port. Switch C's port toward B is neither a root port nor a designated port, so it enters the Blocking state. The B-C link is logically disabled, eliminating the potential loop while remaining available for failover.
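The three-switch example can be reproduced with a small Python sketch that elects the root, computes path costs, and assigns a role to each end of every link. It is a centralized simplification: real switches reach the same result by exchanging BPDUs. The MAC addresses, the `compute_roles` function, and the use of a link index as the final tiebreaker (standing in for the port ID) are assumptions for illustration, and the topology is assumed to be connected.

```python
import heapq

# Hypothetical triangle: name -> (priority, MAC).
BRIDGES = {
    "A": (4096, "00:00:00:00:00:0a"),
    "B": (32768, "00:00:00:00:00:0b"),
    "C": (32768, "00:00:00:00:00:0c"),
}
LINKS = [("A", "B", 4), ("A", "C", 4), ("B", "C", 4)]  # (end, end, port cost)


def peer_of(link, name):
    a, b, _ = link
    return b if name == a else a


def compute_roles(bridges, links):
    root = min(bridges, key=bridges.get)  # lowest (priority, MAC)

    # Lowest cumulative cost from each bridge to the root (Dijkstra).
    cost = {root: 0}
    heap = [(0, root)]
    while heap:
        c, node = heapq.heappop(heap)
        if c > cost[node]:
            continue
        for link in links:
            if node in link[:2]:
                peer, w = peer_of(link, node), link[2]
                if c + w < cost.get(peer, float("inf")):
                    cost[peer] = c + w
                    heapq.heappush(heap, (c + w, peer))

    # Root port: best (total cost, upstream Bridge ID, link index).
    root_link = {}
    for name in bridges:
        if name != root:
            options = [(cost[peer_of(l, name)] + l[2], bridges[peer_of(l, name)], i)
                       for i, l in enumerate(links) if name in l[:2]]
            root_link[name] = min(options)[2]

    # Designated port: the end with the lower (root path cost, Bridge ID).
    roles = {}
    for i, (a, b, _) in enumerate(links):
        designated = min((a, b), key=lambda n: (cost[n], bridges[n]))
        other = b if designated == a else a
        roles[(designated, i)] = "designated"
        roles[(other, i)] = "root" if root_link.get(other) == i else "blocked"
    return root, roles


root, roles = compute_roles(BRIDGES, LINKS)
for (bridge, link_index), role in sorted(roles.items()):
    print(bridge, LINKS[link_index][:2], role)
```

For this topology, Switch C's end of the B-C link (link index 2) is the only port that ends up blocked, matching the description above.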

STP Timers

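STP convergence behavior is governed by three timers, all set on the root bridge and propagated in BPDUs to all other switches: Hello Time (default 2 seconds, the interval between Configuration BPDUs), Max Age (default 20 seconds, how long a switch retains BPDU information before declaring it stale), and Forward Delay (default 15 seconds, the time spent in each of the Listening and Learning states).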

These conservative defaults mean worst-case convergence is roughly 50 seconds (Max Age + 2 x Forward Delay). While it is possible to tune these values, reducing them too aggressively increases the risk of temporary loops during convergence — the timers exist specifically to give all switches in the network time to receive and process the new topology information before any port begins forwarding.
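The arithmetic behind the 50-second figure, and the risk of tuning too aggressively, can be expressed as a small sketch. The `StpTimers` class is a hypothetical container. Its `is_consistent` check encodes the timer relationships commonly cited from 802.1D (2 x (Forward Delay - 1) >= Max Age, and Max Age >= 2 x (Hello Time + 1)); those inequalities do not appear in this article, so treat them as an assumption to verify against the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StpTimers:
    hello_time: int = 2      # seconds between Configuration BPDUs
    max_age: int = 20        # seconds before BPDU information is stale
    forward_delay: int = 15  # seconds in each of Listening and Learning

    def worst_case_convergence(self) -> int:
        # Max Age expiry, then Listening, then Learning.
        return self.max_age + 2 * self.forward_delay

    def is_consistent(self) -> bool:
        # Assumed 802.1D relationships between the three timers.
        return (2 * (self.forward_delay - 1) >= self.max_age
                and self.max_age >= 2 * (self.hello_time + 1))


defaults = StpTimers()
print(defaults.worst_case_convergence(), defaults.is_consistent())

tuned = StpTimers(hello_time=1, max_age=10, forward_delay=7)
print(tuned.worst_case_convergence(), tuned.is_consistent())

too_aggressive = StpTimers(max_age=20, forward_delay=4)
print(too_aggressive.is_consistent())
```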

Topology Change Process

When a switch detects a topology change (a port moving from Forwarding to Blocking or a port moving to Forwarding), it initiates the following process:

  1. The detecting switch sends a TCN BPDU out its root port toward the root bridge.
  2. The upstream switch acknowledges receipt by setting the TCA (Topology Change Acknowledgment) flag in its next Configuration BPDU back to the sender.
  3. The upstream switch also relays the TCN out its own root port, hop by hop, until it reaches the root bridge.
  4. The root bridge sets the TC (Topology Change) flag in its Configuration BPDUs for a period of Max Age + Forward Delay seconds (default 35 seconds).
  5. All switches receiving BPDUs with the TC flag reduce their MAC address table aging time from the normal 300 seconds to Forward Delay (15 seconds). This causes stale MAC entries to age out quickly, forcing the switch to re-learn MAC address locations through flooding.

The MAC table flush on topology change is one of the most disruptive aspects of STP. Even a single port flap on a distant access switch triggers a network-wide MAC table aging reduction, causing temporary flooding on every switch. This is why STP edge port features (PortFast) are important — they prevent end-device port transitions from triggering topology changes.
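A rough sketch of the topology change process, using hypothetical switch names: `tcn_path` follows root ports upstream until it reaches the root bridge, `tc_flag_duration` computes how long the root keeps the TC flag set, and `mac_aging_time` returns the aging time a switch applies while it sees that flag. The defaults are the values quoted above.

```python
def tcn_path(switch: str, upstream: dict[str, str]) -> list[str]:
    """Bridges a TCN visits, following root ports until the root bridge."""
    path = [switch]
    while path[-1] in upstream:  # the root bridge has no upstream entry
        path.append(upstream[path[-1]])
    return path


def tc_flag_duration(max_age: int = 20, forward_delay: int = 15) -> int:
    # The root bridge sets the TC flag for Max Age + Forward Delay seconds.
    return max_age + forward_delay


def mac_aging_time(tc_flag_seen: bool, normal_aging: int = 300,
                   forward_delay: int = 15) -> int:
    # While the TC flag is seen, entries age out after Forward Delay.
    return forward_delay if tc_flag_seen else normal_aging


# Hypothetical hierarchy: each switch maps to the neighbor on its root port.
upstream = {"access1": "dist1", "dist1": "core1"}
print(tcn_path("access1", upstream))
print(tc_flag_duration())
print(mac_aging_time(True), mac_aging_time(False))
```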

Rapid Spanning Tree Protocol (RSTP) — 802.1w

RSTP was standardized in 2001 as IEEE 802.1w and later incorporated into the 802.1D-2004 revision, effectively replacing the original STP. RSTP addresses the fundamental flaw of classic STP: its slow convergence. RSTP achieves subsecond failover in most topologies through several key improvements:

New Port Roles

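RSTP introduces two new port roles alongside the existing Root Port and Designated Port: the Alternate Port, which offers an alternative path to the root bridge through a different switch and can take over if the root port fails, and the Backup Port, which is a redundant connection to a segment where the same switch already has a designated port.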

Simplified Port States

RSTP collapses the five 802.1D port states into three: Discarding (which merges Disabled, Blocking, and Listening), Learning, and Forwarding.

The elimination of the separate Listening state is possible because RSTP uses a proposal/agreement mechanism rather than timer-based transitions.

Proposal/Agreement Mechanism

The key innovation of RSTP is its proposal/agreement handshake, which allows ports to transition directly to Forwarding without waiting through Forward Delay timers. When a designated port wants to move to Forwarding, it sends a BPDU with the Proposal flag set. The downstream switch, upon receiving this proposal, blocks all its other non-edge ports (to prevent temporary loops), confirms that its topology is safe, and sends back a BPDU with the Agreement flag. Upon receiving the agreement, the designated port immediately transitions to Forwarding.

This handshake propagates through the network in a wave: each switch proposes to its downstream neighbors, which in turn block and agree, then propose to their own downstream neighbors. The entire network can converge in a fraction of a second because each handshake takes only one round-trip time between adjacent switches, rather than 30 seconds of timer-based delays.
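The wave can be illustrated with a simplified Python model. It assumes the active topology is already a tree, described by a hypothetical `downstream` mapping from each switch to the neighbors reached through its designated ports, and that every link starts in Discarding. Edge ports, timers, and BPDU contents are not modelled.

```python
from collections import deque


def converge(downstream: dict[str, list[str]], root: str):
    """Simulate the proposal/agreement wave; return link states and events."""
    state = {(up, nb): "discarding"
             for up, neighbors in downstream.items() for nb in neighbors}
    events = []
    pending = deque([root])
    while pending:
        up = pending.popleft()
        for nb in downstream.get(up, []):
            events.append((up, nb, "proposal"))
            # Sync: the neighbor holds its own downstream-facing ports in
            # Discarding before it agrees, so no temporary loop can form.
            for child in downstream.get(nb, []):
                state[(nb, child)] = "discarding"
            events.append((nb, up, "agreement"))
            state[(up, nb)] = "forwarding"  # no Forward Delay wait
            pending.append(nb)              # neighbor now proposes in turn
    return state, events


topology = {"core": ["dist1", "dist2"], "dist1": ["acc1"]}
state, events = converge(topology, "core")
for sender, receiver, kind in events:
    print(f"{sender} -> {receiver}: {kind}")
```

Each link moves to Forwarding as soon as its own handshake completes, and the dist1-to-acc1 handshake only begins after dist1 has agreed with the core, which is the wave behavior described above.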

Edge Ports

RSTP formalizes the concept of an edge port — a port connected to an end device (host, server, printer) rather than another switch. Edge ports transition immediately to Forwarding without any proposal/agreement or timer delays, because they cannot create loops (there is no switch on the other end to form a loop with). If an edge port ever receives a BPDU, it automatically loses its edge status and participates in normal RSTP operation. This is the standardized equivalent of Cisco's proprietary PortFast feature.

Active Topology Maintenance

In classic STP, only the root bridge originates BPDUs, and other switches relay them. If a non-root switch stops receiving BPDUs, it must wait for Max Age before acting. In RSTP, every switch generates its own BPDUs on its designated ports every Hello Time (2 seconds). If a switch misses three consecutive BPDUs from its neighbor (6 seconds), it considers the neighbor lost and immediately begins reconvergence. This eliminates the 20-second Max Age delay.
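The detection rule reduces to simple arithmetic, sketched here with hypothetical helper names and timestamps expressed in seconds.

```python
def neighbor_lost_after(hello_time: float = 2.0, missed_hellos: int = 3) -> float:
    # RSTP treats a neighbor as lost after three missed BPDUs.
    return hello_time * missed_hellos


def is_neighbor_lost(last_bpdu_at: float, now: float,
                     hello_time: float = 2.0) -> bool:
    return now - last_bpdu_at >= neighbor_lost_after(hello_time)


CLASSIC_MAX_AGE = 20.0  # classic STP waits this long before reacting

print(neighbor_lost_after())            # RSTP detection time in seconds
print(is_neighbor_lost(100.0, 104.0))   # two hellos missed so far
print(is_neighbor_lost(100.0, 106.0))   # three hellos missed
print(CLASSIC_MAX_AGE - neighbor_lost_after())  # seconds saved on detection
```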

Multiple Spanning Tree Protocol (MSTP) — 802.1s

In networks with multiple VLANs, running a single spanning tree instance for all VLANs is suboptimal. Some links may be blocked that could be carrying traffic for certain VLANs, wasting bandwidth. Cisco's proprietary Per-VLAN Spanning Tree Plus (PVST+) runs a separate STP instance for each VLAN, allowing different VLANs to use different active topologies. However, with hundreds of VLANs, running hundreds of independent STP instances creates significant CPU and BPDU overhead on every switch.

MSTP, defined in IEEE 802.1s (later merged into 802.1Q-2005), provides a middle ground. MSTP maps multiple VLANs to a smaller number of spanning tree instances (MSTIs). For example, VLANs 1-100 might map to instance 1, and VLANs 101-200 to instance 2. Each instance computes its own independent spanning tree, so different groups of VLANs can use different active topologies. This provides load balancing without the overhead of per-VLAN STP.

MSTP organizes the network into MST regions. Switches within the same region must share the same configuration: the region name, the revision number, and the VLAN-to-instance mapping table.

Within a region, MSTP runs multiple spanning tree instances, each with its own root bridge, port roles, and port states. Between regions, MSTP presents the entire region as a single virtual bridge to the Common and Internal Spanning Tree (CIST), which provides loop-free connectivity across regions. This hierarchical design keeps STP complexity local while maintaining global loop freedom.
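The VLAN-to-instance mapping and the region membership test can be sketched as follows. The sketch assumes a region is identified by its name, revision number, and VLAN-to-instance table, and that VLANs not explicitly mapped fall into instance 0. The `MstConfig` class is hypothetical, and it compares the tables directly, whereas real switches exchange a digest of the table rather than the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MstConfig:
    region_name: str
    revision: int
    # (first_vlan, last_vlan, instance) ranges; unlisted VLANs use instance 0.
    vlan_ranges: tuple[tuple[int, int, int], ...]

    def instance_for(self, vlan: int) -> int:
        for first, last, instance in self.vlan_ranges:
            if first <= vlan <= last:
                return instance
        return 0

    def vlan_table(self) -> tuple[int, ...]:
        # Instance for every usable VLAN ID, 1 through 4094.
        return tuple(self.instance_for(v) for v in range(1, 4095))


def same_region(a: MstConfig, b: MstConfig) -> bool:
    return ((a.region_name, a.revision, a.vlan_table())
            == (b.region_name, b.revision, b.vlan_table()))


sw1 = MstConfig("campus", 1, ((1, 100, 1), (101, 200, 2)))
sw2 = MstConfig("campus", 1, ((1, 50, 1), (51, 100, 1), (101, 200, 2)))
sw3 = MstConfig("campus", 2, ((1, 100, 1), (101, 200, 2)))

print(sw1.instance_for(50), sw1.instance_for(150), sw1.instance_for(300))
print(same_region(sw1, sw2))  # same effective mapping, written differently
print(same_region(sw1, sw3))  # revision differs, so a different region
```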

STP Security Attacks

STP operates on trust — switches accept BPDUs from any device without authentication. This makes STP vulnerable to several attacks, particularly from devices connected to access ports:

BPDU Spoofing (Root Bridge Attack)

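An attacker connects a device to the network and sends crafted BPDUs advertising a Bridge ID lower than the current root bridge. All switches accept this superior BPDU and recalculate their spanning tree, electing the attacker's device as the new root bridge. The consequences are severe: traffic that previously followed paths toward the legitimate root may now be forwarded through the attacker's device, where it can be intercepted or dropped, and the reconvergence itself disrupts forwarding across the Layer 2 domain.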

BPDU Flooding

An attacker sends a large volume of BPDUs with varying root bridge IDs and parameters, forcing all switches to continuously recompute their spanning tree. This consumes switch CPU and causes repeated topology changes, effectively creating a denial-of-service condition across the entire Layer 2 domain. Even if no single BPDU claims root status, the volume of STP computation can overwhelm switch control planes.

Topology Change Attack

An attacker sends TCN BPDUs or Configuration BPDUs with the TC flag set, causing all switches to reduce their MAC address table aging time. This forces continuous MAC table flushing and flooding, degrading network performance and potentially enabling eavesdropping on traffic that would normally be switched to a specific port.

STP Security Protections

Managed switches provide several features to defend against STP attacks and failures. BPDU Guard belongs on every access port; the other features apply to specific port types, as described in each subsection below.

BPDU Guard

BPDU Guard is the most important STP security feature. When enabled on a port, it immediately places the port in an error-disabled (err-disabled) state if a BPDU is received. This completely prevents any device connected to that port from participating in STP. BPDU Guard should be enabled on all access ports — ports connected to end devices like computers, servers, and printers that should never send BPDUs. A legitimate end device never sends BPDUs; receiving one indicates either a misconfigured switch, a bridging-mode device, or an attack.

Typical configuration (Cisco IOS):

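interface GigabitEthernet0/1
  ! Access port for an end device in VLAN 10
  switchport mode access
  switchport access vlan 10
  ! Move to Forwarding immediately on link-up
  spanning-tree portfast
  ! Err-disable the port if any BPDU is received
  spanning-tree bpduguard enable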

When a port is error-disabled by BPDU Guard, it requires manual intervention (shutdown / no shutdown) or automatic recovery (via errdisable recovery cause bpduguard) to return to service. This ensures that administrators are alerted to the issue.
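The behavior can be modelled as a tiny state machine. This is a conceptual sketch only: the `AccessPort` class and its state names are illustrative and do not correspond to any switch API.

```python
from dataclasses import dataclass


@dataclass
class AccessPort:
    name: str
    bpdu_guard: bool = True
    state: str = "forwarding"  # PortFast: forwarding as soon as link is up

    def receive_bpdu(self) -> str:
        if self.bpdu_guard:
            # Any BPDU on a guarded port disables it immediately.
            self.state = "err-disabled"
        # Without BPDU Guard the port would take part in STP (not modelled).
        return self.state

    def recover(self) -> str:
        # Stands in for shutdown / no shutdown or errdisable auto-recovery.
        if self.state == "err-disabled":
            self.state = "forwarding"
        return self.state


port = AccessPort("GigabitEthernet0/1")
print(port.receive_bpdu())  # a rogue switch was plugged in
print(port.recover())       # administrator restores the port
```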

Root Guard

Root Guard prevents a port from ever becoming a root port. If a BPDU with a superior Bridge ID is received on a Root Guard-enabled port, the port is placed in a root-inconsistent state (effectively blocking) rather than accepting the new root and reconverging. The port automatically recovers when it stops receiving the superior BPDUs.

Root Guard is used on designated ports facing downstream switches that should never become the path to the root bridge. For example, core switches facing distribution switches might enable Root Guard to ensure the root bridge always stays in the core, regardless of what happens at the access layer.
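As a conceptual sketch, the Root Guard decision comes down to one comparison. The function name and return strings are illustrative, Bridge IDs are assumed to be `(priority, MAC)` tuples with lowercase MAC strings, and only the Root Bridge ID is compared, which simplifies the full BPDU comparison.

```python
def root_guard_decision(current_root: tuple[int, str],
                        received_root: tuple[int, str],
                        root_guard: bool = True) -> str:
    superior = received_root < current_root  # lower Bridge ID wins
    if superior and root_guard:
        return "root-inconsistent"  # port blocks; the root stays put
    if superior:
        return "accept-new-root"    # unprotected port triggers reconvergence
    return "forwarding"


core_root = (4096, "00:00:00:00:00:0a")
rogue_root = (0, "00:00:00:00:00:ff")

print(root_guard_decision(core_root, rogue_root))
print(root_guard_decision(core_root, rogue_root, root_guard=False))
print(root_guard_decision(core_root, core_root))
```

Calling the function again once superior BPDUs stop arriving returns "forwarding", mirroring the automatic recovery described above.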

BPDU Filter

BPDU Filter suppresses BPDU transmission and reception on a port. When enabled globally (in conjunction with PortFast), it stops edge ports from sending BPDUs, but if a BPDU is received, the port loses its PortFast status and participates in STP normally. When enabled per-interface, it unconditionally drops all BPDUs in both directions, which is dangerous because it can create loops if misapplied. BPDU Filter should be used with extreme caution and typically only in specific scenarios like service provider environments.

Loop Guard

Loop Guard protects against loops caused by unidirectional link failures. If a non-designated port (root port or alternate port) stops receiving BPDUs (due to a fiber strand failure, a software bug, or a misconfigured BPDU filter), classic STP would eventually expire the Max Age timer and transition the port to Forwarding — potentially creating a loop. Loop Guard prevents this by placing the port in a loop-inconsistent state instead of transitioning to Forwarding when BPDUs stop arriving. The port automatically recovers when BPDUs resume.
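The difference Loop Guard makes when BPDUs stop arriving can be sketched as a single decision function. The function name, role strings, and return values are illustrative, not a vendor API.

```python
def on_bpdu_timeout(port_role: str, loop_guard: bool) -> str:
    """State a port moves to when BPDUs stop arriving on it."""
    if port_role not in ("root", "alternate"):
        return "unchanged"  # designated ports send BPDUs; nothing to time out
    if loop_guard:
        return "loop-inconsistent"  # stays blocked until BPDUs resume
    # Without Loop Guard the port eventually forwards, which creates a loop
    # if the neighbor is still alive behind a one-way link failure.
    return "forwarding"


for role in ("alternate", "root", "designated"):
    print(role, on_bpdu_timeout(role, loop_guard=True),
          on_bpdu_timeout(role, loop_guard=False))
```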

PortFast and Its Importance

Without PortFast (or RSTP edge port configuration), a newly connected end device must wait through the Listening (15s) and Learning (15s) states, a total of 30 seconds, before its port begins forwarding traffic. Users notice this delay as DHCP timeouts, PXE boot failures, and applications that cannot connect until the port transitions to Forwarding.

PortFast (Cisco) or edge port (RSTP) causes the port to transition immediately to Forwarding when the link comes up, bypassing the Listening and Learning states. It also prevents the port from generating Topology Change Notifications when it goes up or down — important because end devices frequently power on, reboot, or disconnect their cables, and these events should not trigger network-wide MAC table flushes.

PortFast should only be configured on access ports connected to end devices. Enabling it on inter-switch links creates the risk of temporary loops during convergence. PortFast is almost always paired with BPDU Guard: PortFast enables fast transitions, and BPDU Guard ensures the port is shut down if a switch is accidentally connected.

STP in Modern Data Centers

Traditional STP has significant limitations in modern data center environments. Blocking redundant links to prevent loops wastes half the available bandwidth in a typical dual-uplink topology. This has driven the adoption of alternatives such as Layer 3 fabrics and VXLAN overlays, which keep all links active instead of blocking them.

Despite these alternatives, STP remains essential in enterprise campus networks, branch offices, and anywhere Layer 2 switching is used with physical redundancy. Even in data centers using VXLAN fabrics, STP typically runs on access-layer VLAN segments within each rack as a safety net.

Troubleshooting STP Issues

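Common STP problems and their symptoms include a root bridge elected by default priority on an access switch (traffic takes suboptimal paths through low-capacity hardware), access ports without PortFast (end devices wait 30 seconds for connectivity and every link flap triggers network-wide flooding), and unidirectional links without Loop Guard (a blocked port moves to Forwarding and creates a loop).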

STP and Ethernet Evolution

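STP was designed for a world of shared-media Ethernet hubs and early Layer 2 switches, where redundancy required physical loops. The protocol has evolved through several generations: the original STP (802.1D, 1990), RSTP (802.1w, 2001) with rapid convergence, and MSTP (802.1s) with VLAN-aware spanning tree instances.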

While the trend in data center networking is toward Layer 3 fabrics and VXLAN overlays that eliminate the need for STP, the protocol remains foundational to understanding Layer 2 networking. Any network with redundant Ethernet links and a flat Layer 2 domain — which describes most enterprise campus and branch networks — depends on STP to prevent catastrophic loops. Understanding how STP works, how to tune it, and how to secure it is essential knowledge for any network engineer.

Explore Layer 2 and Layer 3 Networking

STP ensures a stable Layer 2 foundation, but the internet operates at Layer 3 and above. You can explore how IP routing and BGP interconnect autonomous systems, examine the subnet structure of IP address space, and trace how traffic flows across the global routing table.

Try looking up any IP address on the god.ad BGP Looking Glass to see which network originates the prefix, the full AS path from your network to the destination, and how Layer 3 routing decisions shape traffic flow on top of the switched Layer 2 infrastructure that STP keeps loop-free.
