How SNMP Works: Network Monitoring with MIBs, OIDs, and Traps

Simple Network Management Protocol (SNMP) is the dominant protocol for monitoring and managing network devices -- routers, switches, firewalls, servers, and virtually any IP-connected equipment. First published as RFC 1067 in 1988 and standardized in its familiar form as RFC 1157 (1990), SNMP provides a standardized framework for querying device status, collecting performance metrics, and receiving asynchronous alerts when something goes wrong. Despite its name, SNMP is anything but simple in practice: it encompasses a hierarchical data model (the Management Information Base), a tree of Object Identifiers numbering in the hundreds of thousands, and three major protocol versions with drastically different security properties.

At its core, SNMP follows a manager-agent architecture. The SNMP manager (a monitoring server running software like Nagios, Zabbix, LibreNMS, or PRTG) polls agents running on managed devices. Each agent exposes a local view of the device's state -- interface counters, CPU utilization, memory usage, routing table entries, BGP peer states, and thousands of other variables -- through a structured namespace that any SNMP-speaking manager can query. Agents can also send unsolicited notifications (traps or informs) when predefined thresholds are crossed or events occur.

SNMP operates over UDP, using port 161 for queries and port 162 for traps. The choice of UDP is deliberate: monitoring traffic should not congest the control plane with TCP retransmissions during network failures -- precisely the time when monitoring data is most critical. If a device is so overloaded that it drops an SNMP poll, the manager simply records a timeout and tries again next cycle. This stateless, fire-and-forget approach keeps SNMP lightweight enough to run on devices with minimal CPU and memory, from embedded switches to multi-chassis routers.

The Management Information Base (MIB)

The MIB is the data model that defines what an SNMP agent exposes. It is not a database in the traditional sense -- it is a schema that describes the variables (called "managed objects") an agent can report, their data types, whether they are read-only or read-write, and their position in a global hierarchical namespace. MIBs are written in a formal notation called Structure of Management Information (SMI), defined in RFCs 2578-2580 for SMIv2.

Each managed object has a unique Object Identifier (OID) -- a sequence of integers that traces a path through a global tree. The tree is rooted at a conceptual top node and branches through organizational assignments maintained by ISO, ITU-T, and IANA. The most commonly encountered branch is the internet subtree at 1.3.6.1, under which standard MIBs and vendor-specific (enterprise) MIBs live.

Key OID prefixes every network engineer encounters:

1.3.6.1.2.1 -- mib-2 (under the mgmt node, 1.3.6.1.2): the standard MIB-II objects defined in RFC 1213. This is where interface counters (ifTable), IP statistics, TCP/UDP connection tables, and system identification live.
1.3.6.1.2.1.1 -- system: basic device information -- sysDescr (1.3.6.1.2.1.1.1), sysObjectID (1.3.6.1.2.1.1.2), sysUpTime (1.3.6.1.2.1.1.3), sysContact, sysName, sysLocation.
1.3.6.1.2.1.2 -- interfaces: the ifTable, containing ifInOctets, ifOutOctets, ifSpeed, ifOperStatus, ifAdminStatus, and error/discard counters for every interface on the device.
1.3.6.1.2.1.15 -- bgp (BGP4-MIB): BGP peer state, established time, update and message counters, and remote AS numbers. Defined in RFC 4273.
1.3.6.1.4.1 -- enterprises: vendor-specific MIBs. Each vendor has an IANA-assigned enterprise number (e.g., Cisco is 9, Juniper is 2636, Arista is 30065). Under their enterprise branch, vendors define proprietary objects for features not covered by standard MIBs.

A single managed object might be a scalar (one value per device, like sysUpTime) or a column in a table. The ifTable is the canonical example: it has one row per interface, and each row contains columns for the interface index, description, type, speed, operational status, and traffic counters. To walk the entire ifTable, a manager issues successive GETNEXT requests (or a GETBULK in SNMPv2c/v3) starting at the table's OID prefix, and the agent returns each row's values in lexicographic order of their OIDs.

MIB files are distributed by vendors as text files in ASN.1 notation. A monitoring system loads these MIB files to translate raw numeric OIDs into human-readable names. Without the MIB loaded, you see 1.3.6.1.2.1.1.3.0; with it, you see sysUpTime.0. The ".0" suffix indicates a scalar instance (as opposed to a table row indexed by interface number or other key).
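The translation step can be illustrated with a minimal Python sketch. It assumes a tiny hand-written lookup table (KNOWN_OBJECTS, a hypothetical stand-in for parsed MIB files) and replaces the longest known numeric prefix with its name, leaving the instance suffix untouched.

```python
# Hypothetical miniature "MIB": a few numeric OID prefixes and their names.
# A real tool would build this table by parsing MIB files.
KNOWN_OBJECTS = {
    (1, 3, 6, 1, 2, 1, 1, 1): "sysDescr",
    (1, 3, 6, 1, 2, 1, 1, 3): "sysUpTime",
    (1, 3, 6, 1, 2, 1, 1, 5): "sysName",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2): "ifDescr",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 10): "ifInOctets",
}


def parse_oid(text):
    """Turn '1.3.6.1.2.1.1.3.0' (optionally with a leading dot) into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def translate(text):
    """Replace the longest known prefix with its name and keep the instance suffix."""
    oid = parse_oid(text)
    for length in range(len(oid), 0, -1):
        name = KNOWN_OBJECTS.get(oid[:length])
        if name is not None:
            suffix = ".".join(str(n) for n in oid[length:])
            return f"{name}.{suffix}" if suffix else name
    return text  # nothing loaded for this OID: fall back to the numeric form


print(translate("1.3.6.1.2.1.1.3.0"))       # scalar instance
print(translate("1.3.6.1.2.1.2.2.1.2.3"))   # table column, row index 3
```

An OID with no matching entry, such as one under a vendor's enterprise branch, comes back unchanged, which is the behavior you see when the vendor MIB has not been loaded.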

SNMP Protocol Operations

SNMP defines a small set of protocol data units (PDUs) for communication between managers and agents. Each PDU is encoded using Basic Encoding Rules (BER) of ASN.1 -- a compact binary encoding that packs type-length-value (TLV) structures into byte sequences. This encoding is one reason SNMP is efficient on the wire but painful to debug without a tool like snmpwalk or Wireshark.
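To make the TLV idea concrete, here is a small Python sketch of BER encoding for two of the types SNMP uses most, INTEGER and OBJECT IDENTIFIER. It covers only definite lengths and these two universal tags; the function names are illustrative and not taken from any SNMP library.

```python
def encode_length(n):
    """BER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag, value):
    """Type (tag), length, value."""
    return bytes([tag]) + encode_length(len(value)) + value


def encode_integer(n):
    """INTEGER (tag 0x02): shortest big-endian two's-complement form."""
    magnitude = n if n >= 0 else ~n
    length = magnitude.bit_length() // 8 + 1   # leaves room for the sign bit
    return encode_tlv(0x02, n.to_bytes(length, "big", signed=True))


def encode_subid(n):
    """Base-128 digits, most significant first; high bit set on all but the last."""
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(groups))


def encode_oid(dotted):
    """OBJECT IDENTIFIER (tag 0x06): the first two arcs share one sub-identifier."""
    ids = [int(p) for p in dotted.strip(".").split(".")]
    first = 40 * ids[0] + ids[1]
    body = b"".join(encode_subid(n) for n in [first] + ids[2:])
    return encode_tlv(0x06, body)


print(encode_oid("1.3.6.1.2.1.1.3.0").hex())   # sysUpTime.0
print(encode_integer(161).hex())
```

The leading arcs 1.3 collapse into the single byte 0x2b (40 * 1 + 3), which is why captures of SNMP traffic are full of byte runs beginning 2b 06 01.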

GET

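The GetRequest PDU asks the agent for the value of one or more specific OIDs. The agent replies with a response PDU (named GetResponse in SNMPv1 and simply Response in SNMPv2c/v3) containing the requested values. If an OID does not exist, an SNMPv2c/v3 agent returns a noSuchObject or noSuchInstance exception in place of that variable's value rather than silently ignoring it; an SNMPv1 agent fails the whole request with noSuchName.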

# Query system uptime and hostname
$ snmpget -v2c -c public 10.0.0.1 sysUpTime.0 sysName.0
SNMPv2-MIB::sysUpTime.0 = Timeticks: (239482100) 27 days, 17:13:41.00
SNMPv2-MIB::sysName.0 = STRING: core-rtr-01.example.net
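The Timeticks value in the output above counts hundredths of a second. The short Python sketch below converts the raw integer into the human-readable form; the formatting mimics the output shown and is an approximation, not a reimplementation of any particular tool.

```python
def format_timeticks(ticks):
    """Render a Timeticks value (hundredths of a second) as days and a clock time."""
    seconds, hundredths = divmod(ticks, 100)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    clock = f"{hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"
    return f"{days} days, {clock}" if days else clock


print(format_timeticks(239482100))   # 27 days, 17:13:41.00
```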

GETNEXT

The GetNextRequest returns the value of the next OID in lexicographic order after the specified OID. This is the mechanism that enables MIB walking: the manager starts at a subtree root, calls GETNEXT repeatedly, and the agent returns each successive object until the walk exits the subtree. The famous snmpwalk command is simply an automated loop of GETNEXT requests.

# Walk the ifDescr column of the interfaces table
$ snmpwalk -v2c -c public 10.0.0.1 ifDescr
IF-MIB::ifDescr.1 = STRING: lo0
IF-MIB::ifDescr.2 = STRING: ge-0/0/0
IF-MIB::ifDescr.3 = STRING: ge-0/0/1
IF-MIB::ifDescr.4 = STRING: ae0
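The walk loop can be sketched in a few lines of Python against an in-memory stand-in for an agent. The AGENT_MIB contents are hypothetical; the point is the ordering rule (OIDs compare sub-identifier by sub-identifier, numerically, so index 10 sorts after index 2) and the exit condition (stop when the returned OID no longer starts with the root).

```python
import bisect

# Hypothetical agent data: OID tuples mapped to values.
AGENT_MIB = {
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "core-rtr-01.example.net",   # sysName.0
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2, 1): "lo0",                 # ifDescr.1
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2, 2): "ge-0/0/0",            # ifDescr.2
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2, 10): "ae0",                # ifDescr.10
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 3, 1): 24,                    # ifType.1
}
# Python compares tuples element by element, which matches OID lexicographic order.
SORTED_OIDS = sorted(AGENT_MIB)


def get_next(oid):
    """Return (next_oid, value) for the first OID strictly after `oid`, or None at the end."""
    i = bisect.bisect_right(SORTED_OIDS, oid)
    if i == len(SORTED_OIDS):
        return None
    nxt = SORTED_OIDS[i]
    return nxt, AGENT_MIB[nxt]


def walk(root):
    """Repeat GETNEXT from `root` until the returned OID leaves the subtree."""
    results, current = [], root
    while True:
        step = get_next(current)
        if step is None or step[0][:len(root)] != root:
            return results
        results.append(step)
        current = step[0]


IF_DESCR = (1, 3, 6, 1, 2, 1, 2, 2, 1, 2)
for oid, value in walk(IF_DESCR):
    print(".".join(map(str, oid)), "=", value)
```

Each iteration of the loop corresponds to one request/response round trip on a real network, which is the cost that GETBULK was designed to reduce.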

GETBULK (SNMPv2c/v3)

Walking a large table with GETNEXT is slow -- one round trip per OID. SNMPv2c introduced GetBulkRequest, which tells the agent to return multiple successive OIDs in a single response. The PDU includes two parameters: non-repeaters (the number of leading OIDs that each receive a single GETNEXT-style lookup) and max-repetitions (how many successive OIDs to return for each of the remaining variables). A typical GETBULK for walking ifTable columns might request 50 repetitions, returning 50 rows of those columns per round trip instead of one.
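The interaction of the two parameters is easier to see in code. The Python sketch below simulates an agent answering a GETBULK, following the RFC 3416 description in which each non-repeater receives one GETNEXT-style lookup and the remaining variables are advanced together, one repetition at a time. The table contents and helper names are hypothetical, and response-size limits are ignored.

```python
import bisect

END_OF_MIB_VIEW = "endOfMibView"

SYS_UPTIME = (1, 3, 6, 1, 2, 1, 1, 3)
IF_DESCR = (1, 3, 6, 1, 2, 1, 2, 2, 1, 2)
IF_IN_OCTETS = (1, 3, 6, 1, 2, 1, 2, 2, 1, 10)

# Hypothetical agent contents: one scalar and two three-row table columns.
MIB = {SYS_UPTIME + (0,): 239482100}
for index, (name, octets) in enumerate(
        [("lo0", 1200), ("ge-0/0/0", 98431), ("ge-0/0/1", 55002)], start=1):
    MIB[IF_DESCR + (index,)] = name
    MIB[IF_IN_OCTETS + (index,)] = octets
SORTED_OIDS = sorted(MIB)


def get_next(oid):
    """Lexicographic successor of `oid`, or an endOfMibView marker past the last object."""
    i = bisect.bisect_right(SORTED_OIDS, oid)
    if i == len(SORTED_OIDS):
        return oid, END_OF_MIB_VIEW
    return SORTED_OIDS[i], MIB[SORTED_OIDS[i]]


def get_bulk(oids, non_repeaters, max_repetitions):
    """Return the varbind list an agent would build for one GETBULK request."""
    varbinds = [get_next(oid) for oid in oids[:non_repeaters]]
    cursors = list(oids[non_repeaters:])
    for _ in range(max_repetitions):
        for slot, oid in enumerate(cursors):
            nxt = get_next(oid)
            varbinds.append(nxt)
            cursors[slot] = nxt[0]   # the next repetition continues from here
    return varbinds


# One request: uptime once, plus three rows of two columns.
for oid, value in get_bulk([SYS_UPTIME, IF_DESCR, IF_IN_OCTETS], 1, 3):
    print(".".join(map(str, oid)), "=", value)
```

Note that the agent does not stop at the end of a column: with a larger max-repetitions the ifDescr cursor would run on into the next column, so the manager is responsible for discarding varbinds that fall outside the subtree it asked about.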

SET

The SetRequest PDU writes a value to a writable OID. SET is how SNMP can be used for configuration management, not just monitoring -- you can administratively disable an interface by setting ifAdminStatus to down(2), change a device's hostname via sysName, or modify VLAN assignments. In practice, SET is used cautiously because it provides no transactional guarantees, no rollback, and limited error reporting. Most operators prefer CLI, NETCONF, or gRPC for configuration changes and restrict SNMP to read-only monitoring.

Traps and Informs

Traps are unsolicited notifications sent from an agent to a manager when a significant event occurs -- an interface goes down, a BGP peer drops, a fan fails, memory exceeds a threshold. Traps are fire-and-forget: the agent sends a UDP packet to the configured trap receiver and does not wait for acknowledgment. If the trap is lost due to network congestion, a full UDP buffer, or the receiver being offline, the event goes unreported.

SNMPv2c introduced Informs as a reliable alternative. An Inform is functionally identical to a trap in content, but the receiver must send an acknowledgment. If the agent does not receive an acknowledgment, it retransmits the Inform (up to a configurable number of retries). This adds reliability at the cost of more agent-side state and more network traffic. Despite this improvement, many deployments still use traps because of their simplicity and lower resource requirements on the agent.
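The difference in sender behavior can be sketched as a retry loop in Python. The callables transmit and wait_for_ack are hypothetical stand-ins for real UDP send and receive code, and the retry count and timeout are illustrative defaults rather than values from any standard.

```python
def send_trap(transmit):
    """A trap is sent once; the sender never learns whether it arrived."""
    transmit()


def send_inform(transmit, wait_for_ack, retries=3, timeout=1.0):
    """Send an Inform and retransmit until it is acknowledged or retries run out.

    Returns the number of transmissions used, or None if no acknowledgment arrived.
    """
    for attempt in range(1, retries + 2):   # first attempt plus `retries` retransmissions
        transmit()
        if wait_for_ack(timeout):
            return attempt
    return None


# Simulated network that drops the first two Informs and delivers the third.
outcomes = iter([False, False, True])
sent = []
attempts = send_inform(lambda: sent.append("inform"), lambda timeout: next(outcomes))
print(attempts, len(sent))
```

The extra state is visible even in this sketch: the sender has to keep the notification around, track the attempt count, and block or schedule a timer while it waits.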

Common trap types defined in standard MIBs include:

linkDown / linkUp -- interface state changes. These are the most frequently generated traps in any network and are critical for fault detection.
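bgpEstablished / bgpBackwardTransition -- BGP peer state transitions, defined in the BGP4-MIB. RFC 4273 deprecates these two names in favor of bgpEstablishedNotification and bgpBackwardTransNotification, but the older names are still widely seen. The first fires when a session enters the Established state; the second fires when the peer's state machine moves from a higher-numbered state to a lower-numbered one, such as Established back to Idle.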
coldStart / warmStart -- device reboot notifications.
authenticationFailure -- an SNMP request was rejected due to an invalid community string or credential. This is useful for detecting unauthorized polling or brute-force community string guessing.

SNMPv1: The Original Protocol

SNMPv1, defined in RFC 1157, is the original version and is still encountered on legacy devices. Its security model is trivially weak: authentication consists of a community string -- a plaintext password sent in every SNMP packet. The default community strings are public for read-only access and private for read-write access. These defaults are so well-known that they are the first thing any network scanner tries.

SNMPv1 has no encryption, no message integrity verification, and no protection against replay attacks. The community string is visible to anyone who can capture network traffic. If an attacker can sniff a single SNMP packet, they have the community string and can poll the device (or, worse, issue SET commands if the read-write community is captured). SNMPv1 also lacks the GETBULK operation, making large MIB walks painfully slow.

Error reporting in SNMPv1 is limited to a handful of generic error codes: noSuchName, tooBig, readOnly, badValue, and genErr. If a multi-variable GET request fails for one OID, the entire request fails -- there is no per-variable error reporting.

SNMPv2c: Better Protocol, Same Bad Security

SNMPv2c (the "c" stands for "community-based") is defined in RFC 1901, with its protocol operations specified in RFCs 1905-1907 and later updated by RFCs 3416-3418. It addressed many of SNMPv1's protocol limitations while deliberately keeping the same insecure community string authentication. The motivation was pragmatic: the original SNMPv2 specification (the "party-based" SNMPv2p) included a complex security framework that was so difficult to implement and deploy that vendors largely ignored it. SNMPv2c was a compromise -- better protocol operations, same simple (insecure) authentication.

Key improvements over SNMPv1:

GETBULK: dramatically reduces the number of round trips needed to walk large tables.
Per-variable error reporting: if one OID in a multi-variable request fails, the other variables still return their values. The failing variable returns an error indication (noSuchObject, noSuchInstance, or endOfMibView) in its value field rather than failing the entire PDU.
Inform: reliable notifications with acknowledgment.
64-bit counters: SNMPv1's 32-bit counter type (Counter in SMIv1, Counter32 in SMIv2) wraps at 2^32 (approximately 4.3 billion). On a saturated 10 Gbps interface, a 32-bit byte counter wraps every ~3.4 seconds, making it useless for accurate traffic measurement. SNMPv2c introduced Counter64, which wraps at 2^64 -- roughly 468 years of line-rate traffic at 10 Gbps. The ifHCInOctets and ifHCOutOctets objects (HC = "high capacity") in the IF-MIB use Counter64 and are essential for monitoring modern high-speed interfaces.

Despite its security shortcomings, SNMPv2c remains the most widely deployed SNMP version in 2026. The combination of easy configuration (just set a community string) and broad device support means that many networks run SNMPv2c and mitigate the security risks through network segmentation -- restricting SNMP traffic to management VLANs and ACLs that limit which source IPs can query the agent.

SNMPv3: Authentication, Encryption, and the User-Based Security Model

SNMPv3, defined in RFCs 3410-3418, finally added proper security to SNMP. Its User-Based Security Model (USM) provides three security levels:

noAuthNoPriv: username-only identification, no authentication or encryption. Equivalent to SNMPv2c security (useless except in trusted lab environments).
authNoPriv: messages are authenticated with an HMAC. RFC 3414 defines HMAC-MD5-96 and HMAC-SHA-96 (SHA-1), and RFC 7860 adds HMAC variants based on SHA-224, SHA-256, SHA-384, and SHA-512. This prevents forgery and ensures the message has not been tampered with, but the contents are still visible on the wire.
authPriv: messages are both authenticated and encrypted. The standardized encryption options are DES (RFC 3414; deprecated, 56-bit key) and AES-128 (RFC 3826); 3DES, AES-192, and AES-256 are widely implemented vendor extensions rather than IETF standards. By modern standards, only authPriv with a SHA-2 HMAC and AES encryption provides meaningful security.
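The authentication half of these levels boils down to a truncated HMAC carried in each message. The Python sketch below shows the idea with the standard library. It is deliberately simplified: it assumes the key has already been derived and localized for the target engine, and it authenticates an opaque byte string rather than a real SNMPv3 message (where the MAC is computed with the authentication field zero-filled). The protocol labels are illustrative; the truncation lengths are the ones given in RFC 3414 and RFC 7860.

```python
import hashlib
import hmac

# Label -> (hash algorithm, truncated MAC length in bytes).
AUTH_PROTOCOLS = {
    "HMAC-SHA-96": ("sha1", 12),          # RFC 3414
    "HMAC-192-SHA-256": ("sha256", 24),   # RFC 7860
    "HMAC-384-SHA-512": ("sha512", 48),   # RFC 7860
}


def compute_mac(key, message, protocol="HMAC-192-SHA-256"):
    """Return the truncated HMAC that would travel with the message."""
    algorithm, length = AUTH_PROTOCOLS[protocol]
    return hmac.new(key, message, algorithm).digest()[:length]


def verify(key, message, received_mac, protocol="HMAC-192-SHA-256"):
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(compute_mac(key, message, protocol), received_mac)


key = bytes(range(32))                       # placeholder key, for illustration only
message = b"get sysUpTime.0"                 # stands in for a serialized SNMPv3 message
mac = compute_mac(key, message)

print(verify(key, message, mac))             # untouched message
print(verify(key, b"set sysName.0", mac))    # tampered message
```

Authentication alone leaves the message readable on the wire; authPriv additionally encrypts the PDU portion before the MAC is computed.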

USM also includes timeliness verification: each SNMPv3 engine maintains a monotonically increasing engineBoots counter (incremented on each reboot) and an engineTime counter (seconds since last boot). Messages are rejected if their timestamp differs from the local clock by more than 150 seconds, which prevents replay attacks. The engine discovery process (defined in RFC 3414) requires the manager to first query the agent's engineID, engineBoots, and engineTime before it can send authenticated requests.
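The timeliness rule can be sketched as a small Python predicate. It follows the check RFC 3414 describes for an authoritative engine (the agent, for ordinary requests); the function and parameter names are assumptions made for this illustration.

```python
TIME_WINDOW_SECONDS = 150
MAX_ENGINE_BOOTS = 2**31 - 1   # engineBoots latches here until the engine is reconfigured


def is_timely(local_boots, local_time, msg_boots, msg_time):
    """Return True if a received message falls inside the time window."""
    if local_boots == MAX_ENGINE_BOOTS:
        return False               # the engine's own boot counter is exhausted
    if msg_boots != local_boots:
        return False               # message was built against a different boot cycle
    return abs(msg_time - local_time) <= TIME_WINDOW_SECONDS


print(is_timely(12, 86_400, 12, 86_390))   # 10 seconds old
print(is_timely(12, 86_400, 12, 86_000))   # 400 seconds old: a replayed capture
print(is_timely(12, 86_400, 11, 86_400))   # captured before the last reboot
```

A captured packet therefore stays replayable for at most a few minutes, and a reboot invalidates everything captured before it.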

SNMPv3 also introduces the View-Based Access Control Model (VACM), defined in RFC 3415. VACM allows administrators to define fine-grained access policies: which users can access which OID subtrees, at which security level, and with what operations (GET, SET, NOTIFY). This is a dramatic improvement over the all-or-nothing community string model of v1/v2c. For example, you can grant a monitoring user read-only access to interface counters and system information while denying access to the configuration MIB subtrees.
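A view is essentially a list of included and excluded subtrees. The Python sketch below is a simplified model of that check: it assumes the most specific (longest) matching subtree decides and leaves out VACM's subtree masks. The example view is hypothetical.

```python
def in_view(view, oid):
    """Decide whether `oid` is visible.

    `view` is a list of (subtree, included) pairs; the longest matching
    subtree wins, and an OID that matches nothing is not in the view.
    """
    best = None
    for subtree, included in view:
        if oid[:len(subtree)] == subtree:
            if best is None or len(subtree) > len(best[0]):
                best = (subtree, included)
    return best is not None and best[1]


# Hypothetical read view for a monitoring user: all of mib-2 except the ip group.
MONITOR_VIEW = [
    ((1, 3, 6, 1, 2, 1), True),       # include mib-2
    ((1, 3, 6, 1, 2, 1, 4), False),   # exclude ip (addresses, routing tables)
]

print(in_view(MONITOR_VIEW, (1, 3, 6, 1, 2, 1, 1, 3, 0)))          # sysUpTime.0
print(in_view(MONITOR_VIEW, (1, 3, 6, 1, 2, 1, 4, 21, 1, 1)))      # inside ip
print(in_view(MONITOR_VIEW, (1, 3, 6, 1, 4, 1, 9, 1)))             # enterprises
```

A real configuration attaches separate read, write, and notify views to each group of users, so the same check runs against a different list depending on the operation.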

# SNMPv3 query with authPriv security level
$ snmpget -v3 -u monitorUser \
    -l authPriv \
    -a SHA-256 -A 'MyAuthPassphrase' \
    -x AES-256 -X 'MyPrivPassphrase' \
    10.0.0.1 sysUpTime.0
SNMPv2-MIB::sysUpTime.0 = Timeticks: (239482100) 27 days, 17:13:41.00

Despite its security advantages, SNMPv3 adoption has been slow. The configuration is significantly more complex than setting a community string: each user needs authentication and privacy passphrases, the engine discovery process adds latency, and not all devices support the stronger hash/cipher combinations. Many network teams use SNMPv3 for write operations and SNMPv2c for read-only polling within a management VLAN, accepting the risk tradeoff for operational simplicity.

SNMP and BGP Monitoring

The BGP4-MIB (RFC 4273) defines SNMP objects for monitoring BGP sessions. Key objects include:

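bgpPeerTable -- one row per BGP peer, containing the peer's IP address, AS number, state (Idle, Connect, Active, OpenSent, OpenConfirm, Established), time since the last state transition, and update and message counters. Received-prefix counts are not part of the standard BGP4-MIB; they come from vendor MIBs under the enterprises branch.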
bgpPeerState -- the FSM state of each peer. Monitoring systems poll this to detect peer flaps. A transition from Established to any other state triggers a bgpBackwardTransition trap.
bgpPeerFsmEstablishedTransitions -- a counter of how many times the peer has transitioned to Established. A high count indicates a flapping peer.
bgpPeerInTotalMessages / bgpPeerOutTotalMessages -- total BGP messages exchanged, useful for detecting unusually high update rates that might indicate a route leak or hijack.
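How a monitoring system might turn two consecutive polls of these objects into alerts is sketched below in Python. The integer values of bgpPeerState are those enumerated in the BGP4-MIB; the dictionary keys, the flap threshold, and the alert wording are assumptions for this example.

```python
from enum import IntEnum


class BgpPeerState(IntEnum):
    """bgpPeerState values as enumerated in the BGP4-MIB."""
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPENSENT = 4
    OPENCONFIRM = 5
    ESTABLISHED = 6


def evaluate_peer(previous, current, flap_threshold=3):
    """Compare two polls of one peer and return a list of alert strings.

    Each poll is a dict with 'state' (bgpPeerState) and
    'established_transitions' (bgpPeerFsmEstablishedTransitions).
    """
    alerts = []
    prev_state = BgpPeerState(previous["state"])
    curr_state = BgpPeerState(current["state"])
    if curr_state < prev_state:
        alerts.append(f"backward transition: {prev_state.name} -> {curr_state.name}")
    if curr_state != BgpPeerState.ESTABLISHED:
        alerts.append(f"peer not established (state {curr_state.name})")
    flaps = current["established_transitions"] - previous["established_transitions"]
    if flaps >= flap_threshold:
        alerts.append(f"flapping: {flaps} transitions to Established since last poll")
    return alerts


before = {"state": 6, "established_transitions": 4}
after = {"state": 3, "established_transitions": 4}
for alert in evaluate_peer(before, after):
    print(alert)
```

The transition counter matters because a peer can drop and re-establish entirely between two polls; the state looks healthy both times, and only the counter reveals the flap.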

For large-scale BGP monitoring, SNMP polling of the BGP4-MIB is often supplemented by BMP (BGP Monitoring Protocol), which provides real-time streaming of BGP updates without the polling overhead. However, SNMP remains the primary method for monitoring BGP peer health (up/down) and collecting basic session metrics, while BMP is used for the full routing table and update stream.

Counter Wrapping and Polling Intervals

SNMP counters are monotonically increasing values that reset to zero when they wrap (overflow) or when the device reboots. The monitoring system must handle these edge cases to compute accurate rates.

For a Counter32 value like ifInOctets, the formula for computing throughput between two polls is:

rate = (current_value - previous_value) / interval_seconds

# If counter wrapped (current < previous):
rate = (2^32 - previous_value + current_value) / interval_seconds

A device reboot resets all counters to zero. The monitoring system detects this by comparing sysUpTime -- if the current sysUpTime is less than the previous poll, the device rebooted, and all delta calculations should be discarded for that interval.
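Put together, the wrap and reboot rules might look like the following Python sketch. The function name and parameters are illustrative. It can recover from at most one wrap per interval, and it treats any decrease in sysUpTime as a reboot, although sysUpTime is itself a 32-bit count of hundredths of a second and rolls over after roughly 497 days.

```python
def counter_rate(prev_value, curr_value, interval_seconds, bits=32,
                 prev_uptime=None, curr_uptime=None):
    """Per-second rate between two polls, or None if the interval must be discarded.

    `prev_uptime` and `curr_uptime` are sysUpTime readings in Timeticks; when
    both are given and uptime went backwards, the device is assumed to have
    rebooted and no rate is reported.
    """
    if prev_uptime is not None and curr_uptime is not None and curr_uptime < prev_uptime:
        return None
    delta = curr_value - prev_value
    if delta < 0:              # counter wrapped once between the two polls
        delta += 2 ** bits
    return delta / interval_seconds


# Normal interval: 30,000 octets in 300 seconds.
print(counter_rate(1_000_000, 1_030_000, 300))
# Wrapped Counter32: 296 octets before the wrap plus 704 after it.
print(counter_rate(4_294_967_000, 704, 10))
# Reboot detected through sysUpTime: the sample is discarded.
print(counter_rate(5_000_000, 1_200, 300, prev_uptime=239_482_100, curr_uptime=4_500))
```

Multiply an octet rate by 8 to get bits per second. Without the uptime check, the third case would be misread as a wrap and produce an enormous bogus spike in the graph.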

Polling interval selection is critical. A 5-minute interval is standard for most metrics, but high-speed interfaces require more careful consideration. On a saturated 100 Gbps link, a Counter32 octet counter wraps every ~0.34 seconds -- far faster than any practical polling interval. This is why Counter64 (ifHCInOctets) is mandatory for high-speed interface monitoring. Counter64 wrap becomes a practical concern only at link speeds far beyond today's: even at a sustained 1 Tbps, a Counter64 octet counter takes roughly 4.7 years to wrap.
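The arithmetic behind these wrap times is simple enough to check directly. The Python sketch below assumes an octet counter on a link running continuously at line rate, which is the worst case.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def wrap_seconds(counter_bits, link_bps):
    """Seconds until an octet counter of the given width wraps at line rate."""
    return (2 ** counter_bits) * 8 / link_bps


for label, bits, bps in [
    ("Counter32 at 1 Gbps", 32, 1e9),
    ("Counter32 at 10 Gbps", 32, 10e9),
    ("Counter32 at 100 Gbps", 32, 100e9),
]:
    print(f"{label}: {wrap_seconds(bits, bps):.2f} s")

for label, bits, bps in [
    ("Counter64 at 10 Gbps", 64, 10e9),
    ("Counter64 at 100 Gbps", 64, 100e9),
]:
    print(f"{label}: {wrap_seconds(bits, bps) / SECONDS_PER_YEAR:.1f} years")
```

Because a single wrap between polls is recoverable and two are not, the polling interval for a given counter has to stay below its wrap time; for Counter32 that rules out 5-minute polling on anything much faster than 100 Mbps.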

SNMP Security Best Practices

SNMP has been the vector for numerous real-world security incidents. Misconfigured community strings have enabled attackers to map internal network topologies, read routing tables, enumerate users, and even modify device configurations via SET operations. Security best practices include:

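Use SNMPv3 with authPriv whenever possible. Prefer SHA-256 (or stronger) authentication with AES encryption, using AES-256 where the device supports it. Avoid MD5 and DES: MD5 is deprecated for new deployments, and DES's 56-bit key is within reach of brute force.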
Restrict SNMP access with ACLs on the device. Only allow SNMP queries from known management station IPs. Configure this in the device's SNMP configuration, not just at the network firewall, for defense in depth.
Isolate management traffic on a dedicated management VLAN or out-of-band network. SNMP traffic should never traverse production data paths.
Change default community strings immediately. "public" and "private" should never be used in production. If you must use SNMPv2c, use a long, random community string and treat it as a password.
Disable SNMP write access unless explicitly required. Most monitoring use cases only need read access. The community string for read-write access (or SNMPv3 write permissions) should be different from the read-only credential.
Monitor for SNMP-based reconnaissance. Unexpected SNMP queries from non-management IPs are a strong indicator of compromise or scanning. The authenticationFailure trap can provide early warning.

SNMP Performance and Scalability

In large networks (thousands of devices with tens of thousands of interfaces), SNMP polling can become a significant load on both the monitoring system and the network. A single polling cycle that walks the ifTable on every device can generate millions of SNMP packets. Key strategies for scaling SNMP monitoring:

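GETBULK with appropriate max-repetitions: reduces round trips by 10-50x compared to GETNEXT walking. Values of 50-100 are typical. Setting the value too high does not help: once the response would exceed the maximum message size, the agent returns fewer variables than requested.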
Stagger polling: spread poll cycles across the interval to avoid thundering herd effects. Instead of polling all 5,000 devices at exactly T+0, spread them across the 5-minute interval.
Poll only what you need: query specific OIDs rather than walking entire subtrees. If you only need interface counters, request the specific ifHCInOctets and ifHCOutOctets columns (which live in the ifXTable) rather than walking the entire ifTable and ifXTable.
Use 64-bit counters: polling ifHCInOctets instead of ifInOctets avoids the need for counter-wrap detection logic and allows longer polling intervals on high-speed links.
Distribute polling: run multiple SNMP poller instances, each responsible for a subset of devices. Tools like Zabbix proxies and LibreNMS distributed pollers support this natively.
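The staggering strategy from the list above can be sketched in Python by deriving a stable offset for each device from a hash of its name. The hostnames are made up, and hashing is only one of several reasonable ways to assign slots; a real poller might also weight devices by interface count.

```python
import hashlib
from collections import Counter


def poll_offset(device, interval_seconds=300):
    """Stable second-within-the-interval at which to poll `device`."""
    digest = hashlib.sha256(device.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_seconds


# Hypothetical inventory of 5,000 devices.
devices = [f"switch-{n:04d}.example.net" for n in range(5000)]
offsets = [poll_offset(name) for name in devices]

load = Counter(offsets)
print("busiest second:", max(load.values()), "devices")
print("slots in use:", len(load), "of 300")
```

Because the offset depends only on the device name, each device keeps the same slot across poller restarts, which keeps its sample spacing regular, and adding a device does not reshuffle the others.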

SNMP vs. Modern Alternatives

SNMP has been the lingua franca of network monitoring for over 30 years, but modern alternatives are increasingly displacing it for specific use cases:

gNMI (gRPC Network Management Interface): a streaming telemetry protocol based on gRPC. Instead of polling, devices push counter updates at configured intervals (or on change). This eliminates the polling overhead and provides near-real-time visibility. gNMI uses YANG data models instead of MIBs and Protocol Buffers for encoding instead of BER/ASN.1.
NETCONF/RESTCONF: XML/JSON-based configuration and operational state retrieval protocols. NETCONF (RFC 6241) runs over SSH and provides transactional configuration capabilities that SNMP SET cannot match. RESTCONF (RFC 8040) exposes the same YANG models over HTTPS/REST.
Streaming telemetry: vendor-specific implementations (Cisco MDT, Juniper JTI, Arista gNMI) push structured data over gRPC or UDP at sub-second intervals. For traffic monitoring, this provides far better granularity than 5-minute SNMP polls.
NetFlow/sFlow/IPFIX: for traffic analysis and flow-level visibility, flow protocols provide information that SNMP counters cannot -- per-flow source/destination, protocol breakdown, and conversation-level statistics.

Despite these alternatives, SNMP is not going away. It remains the only truly universal protocol supported by virtually every network device from every vendor. A 15-year-old switch that has never received a firmware update still speaks SNMP. Heterogeneous networks with equipment from five vendors and three decades of deployment will rely on SNMP for the foreseeable future. The practical reality is that most monitoring stacks use SNMP as the baseline for device health and interface counters, supplemented by streaming telemetry for high-resolution traffic data and NETCONF/gNMI for configuration management.

SNMP in DNS and Internet Infrastructure Monitoring

SNMP also plays a role in monitoring DNS infrastructure. BIND, the most widely deployed authoritative DNS server, has no built-in SNMP agent, but operators commonly export its statistics (query rates, cache hit ratios, zone transfer counts, and recursive resolution performance) through an SNMP agent extension such as Net-SNMP's extend mechanism. DNS operators poll these values to detect anomalous query rates that might indicate a DDoS attack or DNS amplification abuse, and configure traps to alert on zone transfer failures that could indicate a misconfigured secondary server.

At Internet Exchange Points (IXPs), SNMP is the standard method for monitoring port utilization. IXP operators poll ifHCInOctets and ifHCOutOctets on every participant's switch port to generate the traffic graphs that are publicly displayed on IXP websites. These graphs -- showing traffic exchanged between ASes -- are built entirely from SNMP counter data collected at 5-minute intervals.

See It in Action

SNMP is the backbone of network monitoring infrastructure worldwide. The very routers that carry BGP routes across the internet are monitored via SNMP -- their interface counters, BGP peer states, and system health metrics are continuously polled by network operations centers. When a BGP session drops, SNMP is often the first system to detect it, firing a bgpBackwardTransition trap before the routing table has fully converged.

Use the god.ad BGP Looking Glass to explore the BGP routing infrastructure that carries SNMP monitoring traffic across the internet. Look up the AS numbers of major network monitoring platform providers, or trace the BGP paths between your monitoring server and the devices it polls.