How Email Works: SMTP, IMAP, and the Mail System

Every day, over 300 billion emails traverse the internet. Behind the simplicity of clicking "send" lies a stack of protocols, DNS lookups, authentication checks, and relay hops that have evolved over four decades. Understanding how email works — from the SMTP handshake to DKIM signature verification — reveals one of the most complex and heavily abused systems on the internet, and explains why so much engineering effort goes into making sure a legitimate message actually reaches the inbox.

The SMTP Protocol: How Email Gets Sent

The Simple Mail Transfer Protocol (SMTP), defined in RFC 5321, is the protocol used to send email between servers. SMTP operates over TCP, traditionally on port 25 (server-to-server relay), port 587 (authenticated submission from mail clients), or port 465 (implicit TLS submission). Despite its name, the protocol is anything but simple in modern practice.

An SMTP conversation is a text-based dialogue between a sending server (the client) and a receiving server. Here is what a typical session looks like:

S: 220 mx.example.com ESMTP ready
C: EHLO mail.sender.com
S: 250-mx.example.com Hello mail.sender.com
S: 250-SIZE 52428800
S: 250-STARTTLS
S: 250-AUTH PLAIN LOGIN
S: 250 8BITMIME
C: STARTTLS
S: 220 Ready to start TLS
  [TLS handshake occurs]
C: EHLO mail.sender.com
S: 250 mx.example.com Hello mail.sender.com
C: MAIL FROM:<[email protected]>
S: 250 OK
C: RCPT TO:<[email protected]>
S: 250 OK
C: DATA
S: 354 Start mail input
C: From: Alice <[email protected]>
C: To: Bob <[email protected]>
C: Subject: Meeting tomorrow
C: Date: Fri, 24 Apr 2026 10:30:00 -0700
C: MIME-Version: 1.0
C: Content-Type: text/plain; charset=UTF-8
C:
C: Hi Bob, are we still on for tomorrow?
C: .
S: 250 OK: queued as ABC123
C: QUIT
S: 221 Bye

Each command serves a specific purpose. EHLO (Extended HELO) identifies the sending server and requests the list of supported extensions. MAIL FROM specifies the envelope sender: the address that bounces go to, which may differ from the From: header the recipient sees. RCPT TO specifies the envelope recipient. DATA begins the message content (the headers, a blank line, then the body), which is terminated by a line containing only a dot.

The distinction between envelope addresses (used by SMTP for routing) and header addresses (displayed to the user) is critical to understanding email spoofing. SMTP itself places no restriction on what appears in the From: header — this is why authentication protocols like SPF, DKIM, and DMARC were invented.
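The envelope/header split can be made concrete with a short Python sketch. It does not open a network connection; it only builds the lines a client would send, so the addresses, the helper name smtp_client_lines, and the sample message are all illustrative assumptions rather than part of any real deployment.

```python
from email.message import EmailMessage


def smtp_client_lines(envelope_from, envelope_rcpts, message):
    """Return the lines an SMTP client would send for one message.

    The envelope addresses go into MAIL FROM / RCPT TO. The header
    addresses live inside the DATA payload, and SMTP itself never
    compares the two.
    """
    lines = [f"MAIL FROM:<{envelope_from}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in envelope_rcpts]
    lines.append("DATA")
    for line in message.as_string().splitlines():
        # Dot-stuffing: a line that starts with "." gets an extra "."
        # so it cannot be mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")  # a lone dot ends the message content
    return lines


msg = EmailMessage()
msg["From"] = "Alice <alice@brand.example>"  # what the recipient sees
msg["To"] = "Bob <bob@example.com>"
msg["Subject"] = "Meeting tomorrow"
msg.set_content("Hi Bob,\n.this line starts with a dot\n")

# The envelope sender differs from the From: header; SMTP accepts this.
session = smtp_client_lines("bounces@mailer.example", ["bob@example.com"], msg)
print("\n".join(session))
```

Nothing in this exchange ties bounces@mailer.example to alice@brand.example, which is exactly the gap the authentication protocols discussed later are meant to close.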

MX Records and DNS Lookup

Before an email can be delivered, the sending server must discover where to deliver it. This is where DNS comes in. The sending server extracts the domain part of the recipient address (the part after the @) and queries DNS for that domain's MX (Mail Exchanger) records to find the mail servers responsible for it.

For example, querying the MX records for gmail.com returns something like:

gmail.com.  MX  5  gmail-smtp-in.l.google.com.
gmail.com.  MX  10 alt1.gmail-smtp-in.l.google.com.
gmail.com.  MX  20 alt2.gmail-smtp-in.l.google.com.
gmail.com.  MX  30 alt3.gmail-smtp-in.l.google.com.
gmail.com.  MX  40 alt4.gmail-smtp-in.l.google.com.

The number before each hostname is the preference value: lower numbers are preferred. The sending server tries the MX with the lowest preference value first; if it is unreachable, it falls back to the next one. This provides redundancy. If no MX record exists, the sender falls back to the domain's A or AAAA record as a last resort, per RFC 5321.
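The selection logic can be sketched in a few lines of Python. The function below assumes the MX answers have already been fetched and are passed in as (preference, hostname) tuples; the name delivery_order is hypothetical, and the random ordering among equal preference values is a common load-spreading practice rather than something stated above.

```python
import random


def delivery_order(mx_records, domain):
    """Return the hosts to try for a domain, most preferred first.

    mx_records is a list of (preference, hostname) tuples taken from
    an MX query that has already been performed elsewhere.
    """
    if not mx_records:
        # No MX records: fall back to the domain's own A/AAAA record.
        return [domain]
    # Shuffle first so hosts that share a preference value are tried in
    # random order; sorted() is stable, so ties keep the shuffled order.
    shuffled = random.sample(mx_records, k=len(mx_records))
    return [host for _, host in sorted(shuffled, key=lambda rec: rec[0])]


gmail_mx = [
    (20, "alt2.gmail-smtp-in.l.google.com."),
    (5, "gmail-smtp-in.l.google.com."),
    (10, "alt1.gmail-smtp-in.l.google.com."),
]
print(delivery_order(gmail_mx, "gmail.com"))
```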

Each of those MX hostnames resolves to IP addresses via A/AAAA records, and those IP addresses are reachable via BGP routes. You can look up any mail server's IP to see which autonomous system hosts it — for instance, Google (AS15169) operates Gmail's mail exchangers. The security of this DNS lookup is where DNSSEC becomes important: without it, an attacker could poison DNS responses and redirect mail to a rogue server.

[Figure: Email delivery flow. Sender MUA -> Submission (587) -> MTA relay (25) -> MX lookup -> Inbound MTA -> Authentication -> Mailbox. (1) The client submits via SMTP on port 587 (authenticated). (2) The sender MTA looks up MX records and connects to the receiver on port 25. (3) The receiver validates SPF, DKIM, and DMARC, then delivers to the mailbox, which the recipient reads via IMAP/POP3.]

Mail Routing and Relaying

Email rarely travels directly from sender to recipient. It passes through a chain of Mail Transfer Agents (MTAs). The sending user's mail client (MUA — Mail User Agent) submits the message to their organization's outbound MTA via authenticated SMTP on port 587. That MTA performs DNS lookups, applies outbound policies (rate limits, DKIM signing, content scanning), and relays the message to the recipient's inbound MTA over port 25.

In many environments, additional relay hops exist. A large organization might route outbound mail through a dedicated gateway (like Proofpoint or Mimecast) for compliance scanning before it reaches the internet. On the receiving side, the MX record might point to a cloud security gateway that inspects the message before forwarding it to the actual mail server. Each relay adds a Received: header to the message, creating an auditable chain that records every server the message touched.

The concept of an open relay — a mail server that forwards messages for anyone, regardless of authentication — was once common and now represents a serious misconfiguration. Open relays are aggressively blocklisted because spammers exploit them to send mail that appears to originate from a trusted network.

IMAP vs POP3: Retrieving Email

SMTP handles sending and relaying. For retrieving email from a mailbox, two protocols exist:

IMAP (Internet Message Access Protocol), defined in RFC 9051, is the modern standard. IMAP keeps messages on the server and synchronizes state across multiple devices. When you read an email on your phone, IMAP marks it as read on the server so your laptop shows the same state. IMAP supports folders, server-side search, partial message fetching (downloading headers without the body), and flags. It operates on port 993 (implicit TLS) or 143 (STARTTLS).

POP3 (Post Office Protocol version 3), defined in RFC 1939, is the older and simpler protocol. POP3 clients download messages and, by default, delete them from the server. This made sense in the era of dial-up connections and single-device access. POP3 operates on port 995 (implicit TLS) or 110 (upgraded to TLS with the STLS command, POP3's counterpart to STARTTLS). While still supported by most servers, POP3 is increasingly rare in practice.

Modern email services like Gmail, Outlook, and Apple Mail primarily use IMAP (or proprietary sync protocols like Exchange ActiveSync and Microsoft Graph) to provide the seamless multi-device experience users expect. JMAP (JSON Meta Application Protocol), defined in RFC 8620 with its mail-specific data model in RFC 8621, is a newer alternative designed to replace IMAP with a more efficient, JSON-based API.

Message Format: RFC 5322 and MIME

The format of an email message itself is defined by RFC 5322 (Internet Message Format). A message consists of header fields followed by a blank line and then the body. The only strictly required header fields are From: and Date:. Destination fields (To:, Cc:, Bcc:) are optional in RFC 5322, although nearly every real message carries at least one. Other common headers include Subject:, Message-ID:, Reply-To:, and In-Reply-To: (for threading).
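As a rough illustration of that layout, the following Python sketch uses the standard library's email package to split a raw message into headers and body and to report whether the From: and Date: fields are present. The sample message, its addresses, and the helper name split_message are made up for the example.

```python
from email import message_from_string, policy

RAW_MESSAGE = (
    "From: Alice <alice@sender.example>\r\n"
    "To: Bob <bob@example.com>\r\n"
    "Subject: Meeting tomorrow\r\n"
    "Date: Fri, 24 Apr 2026 10:30:00 -0700\r\n"
    "Message-ID: <abc123@mail.sender.example>\r\n"
    "\r\n"  # the blank line separates headers from body
    "Hi Bob, are we still on for tomorrow?\r\n"
)


def split_message(raw):
    """Return (headers, body, missing) for a raw message string."""
    msg = message_from_string(raw, policy=policy.default)
    headers = {name: str(value) for name, value in msg.items()}
    # Header lookups on the parsed message are case-insensitive.
    missing = [name for name in ("From", "Date") if name not in msg]
    return headers, msg.get_content(), missing


headers, body, missing = split_message(RAW_MESSAGE)
print(headers["Subject"], "| missing:", missing)
```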

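The original RFC 822 (since superseded by RFC 2822 and then RFC 5322) only supported 7-bit ASCII text. MIME (Multipurpose Internet Mail Extensions), defined across RFCs 2045-2049, extended email to support text in character sets other than ASCII, non-text content such as images and attachments, multipart message bodies, and non-ASCII text in header fields.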

A modern HTML email with an attachment might have a MIME structure like: multipart/mixed containing a multipart/alternative (with text/plain and text/html parts) plus an application/pdf attachment. This nesting is why email parsers are notoriously complex.
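The nesting described above can be reproduced with Python's standard email.message.EmailMessage class. This is only a sketch: the subject, body text, and the placeholder bytes standing in for a PDF are invented, and mime_tree is a hypothetical helper for printing the structure.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Quarterly report"
msg.set_content("Plain-text version of the summary.")  # text/plain
msg.add_alternative("<p>HTML version of the summary.</p>", subtype="html")
msg.add_attachment(
    b"%PDF-1.4 placeholder bytes, not a real PDF",
    maintype="application",
    subtype="pdf",
    filename="report.pdf",
)


def mime_tree(part, depth=0):
    """Return one indented line per MIME part, depth-first."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.iter_parts():
            lines.extend(mime_tree(child, depth + 1))
    return lines


print("\n".join(mime_tree(msg)))
```

Adding the attachment wraps the existing multipart/alternative container in a new multipart/mixed container, which is why a parser has to recurse rather than assume a flat list of parts.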

Email Authentication: SPF, DKIM, and DMARC

SMTP's original design had no concept of sender authentication. Anyone could (and still can, at the protocol level) send a message claiming to be from any address. Three complementary standards now form the email authentication stack:

SPF (Sender Policy Framework)

SPF, defined in RFC 7208, lets a domain publish a DNS TXT record listing the IP addresses and servers authorized to send mail for that domain. When a receiving server gets a message with an envelope sender in @example.com, it queries the SPF record for example.com and checks whether the sending server's IP is listed.

A typical SPF record looks like:

example.com.  TXT  "v=spf1 ip4:198.51.100.0/24 include:_spf.google.com -all"

This says: accept mail from the 198.51.100.0/24 range, accept mail from servers authorized by Google's SPF record (for Google Workspace), and reject everything else (-all). SPF has a limitation: it only validates the envelope sender (MAIL FROM), not the From: header the user sees. A phisher can pass SPF by using their own domain as the envelope sender while spoofing a different From: header.
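A simplified evaluator shows how a receiver might walk such a record. This sketch handles only the ip4:, ip6:, include:, and all mechanisms and silently skips everything else (a, mx, redirect, macros, lookup limits), so it is not a complete RFC 7208 implementation. The txt_records dictionary is a stand-in for real DNS queries, and the record shown for _spf.google.com is invented for the example, not Google's actual policy.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def evaluate_spf(domain, client_ip, txt_records):
    """Evaluate a simplified SPF policy for a connecting IP address.

    txt_records maps domain -> SPF record text and replaces DNS here.
    """
    record = txt_records.get(domain)
    if record is None or not record.startswith("v=spf1"):
        return "none"
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:
        qualifier = "+"  # a mechanism without a qualifier means "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        if name == "all":
            matched = True
        elif name in ("ip4", "ip6"):
            matched = ip in ipaddress.ip_network(value, strict=False)
        elif name == "include":
            # include matches only if the referenced policy says "pass".
            matched = evaluate_spf(value, client_ip, txt_records) == "pass"
        else:
            continue  # mechanism not supported by this sketch
        if matched:
            return QUALIFIERS[qualifier]
    return "neutral"


txt_records = {
    "example.com": "v=spf1 ip4:198.51.100.0/24 include:_spf.google.com -all",
    "_spf.google.com": "v=spf1 ip4:203.0.113.0/24 ~all",  # invented record
}
print(evaluate_spf("example.com", "192.0.2.1", txt_records))
```

Note that the domain passed in is the envelope sender's domain, which is why a pass here says nothing about the From: header.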

DKIM (DomainKeys Identified Mail)

DKIM, defined in RFC 6376, adds a cryptographic signature to outgoing messages. The sending server signs specific headers and the message body using a private key, and publishes the corresponding public key in a DNS TXT record. The receiving server retrieves the public key and verifies the signature.

A DKIM signature header looks like:

DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=selector1;
  h=from:to:subject:date:message-id;
  bh=abcdef123456...=;
  b=GHIJKL789012...=

The d= field identifies the signing domain, s= is the selector used to look up the public key (at selector1._domainkey.example.com), h= lists the signed headers, bh= is the body hash, and b= is the signature itself. A valid DKIM signature proves that the signed headers and the body were not modified in transit and that the message was signed by a server with access to the domain's private key.
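Before any cryptography happens, a verifier has to parse the tag list and work out where the public key lives. The Python sketch below covers only that first step, using the truncated placeholder values from the example header as-is; the function names are hypothetical, and signature verification itself is out of scope.

```python
import re

DKIM_HEADER_VALUE = (
    "v=1; a=rsa-sha256; d=example.com; s=selector1;\n"
    "  h=from:to:subject:date:message-id;\n"
    "  bh=abcdef123456...=;\n"
    "  b=GHIJKL789012...="
)


def parse_dkim_signature(header_value):
    """Split a DKIM-Signature value into a dict of tag -> value."""
    tags = {}
    for item in header_value.split(";"):
        name, sep, value = item.partition("=")
        if not sep:
            continue  # empty segment, e.g. after a trailing semicolon
        # Folding whitespace is dropped; that is safe for the tags used here.
        tags[name.strip()] = re.sub(r"\s+", "", value)
    return tags


def dkim_key_record_name(tags):
    """Return the DNS name whose TXT record holds the public key."""
    return f"{tags['s']}._domainkey.{tags['d']}"


tags = parse_dkim_signature(DKIM_HEADER_VALUE)
print(dkim_key_record_name(tags), tags["h"].split(":"))
```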

DMARC (Domain-based Message Authentication, Reporting, and Conformance)

DMARC, defined in RFC 7489, ties SPF and DKIM together and adds a policy layer. It solves the alignment problem: DMARC requires that the domain in the From: header (which the user sees) matches either the SPF-validated envelope domain or the DKIM signing domain. This closes the gap that allowed phishers to pass SPF or DKIM while spoofing the From: header.

A DMARC record is published at _dmarc.example.com:

_dmarc.example.com.  TXT  "v=DMARC1; p=reject; rua=mailto:[email protected]; pct=100"

The p= field specifies the policy: none (monitor only), quarantine (send to spam), or reject (block entirely). The rua= field specifies where aggregate reports should be sent — these reports tell domain owners who is sending mail on their behalf, both legitimately and fraudulently.
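The alignment rule and the policy lookup can be combined into a small decision function. This Python sketch makes several loud assumptions: the reporting address is a placeholder, pct= is ignored, and organizational_domain naively keeps the last two labels, whereas a real implementation consults the Public Suffix List (so it would be wrong for domains under suffixes such as co.uk).

```python
def parse_dmarc(record):
    """Parse 'v=DMARC1; p=reject; ...' into a dict of tag -> value."""
    tags = {}
    for item in record.split(";"):
        name, sep, value = item.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    return tags


def organizational_domain(domain):
    # NAIVE assumption: the last two labels form the organizational domain.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def aligned(from_domain, auth_domain, strict):
    if strict:
        return from_domain.lower() == auth_domain.lower()
    return organizational_domain(from_domain) == organizational_domain(auth_domain)


def dmarc_disposition(from_domain, spf, dkim, policy):
    """spf and dkim are (passed, domain) tuples from the earlier checks."""
    spf_ok = spf[0] and aligned(from_domain, spf[1], policy.get("aspf", "r") == "s")
    dkim_ok = dkim[0] and aligned(from_domain, dkim[1], policy.get("adkim", "r") == "s")
    if spf_ok or dkim_ok:
        return "pass"
    return policy.get("p", "none")  # none, quarantine, or reject


policy = parse_dmarc("v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com; pct=100")
# SPF passes, but only for the attacker's own envelope domain.
print(dmarc_disposition("example.com", (True, "attacker.example"), (False, ""), policy))
```

The spoofing case in the last line is the one SPF alone cannot catch: the SPF check passes, but for a domain that does not align with the visible From: domain, so the published policy applies.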

[Figure: SPF / DKIM / DMARC validation flow. An incoming email is checked against three DNS TXT records. SPF check (query example.com TXT): is the sender IP authorized? DKIM check (query sel._domainkey.example.com): verify the signature with the public key. DMARC check (query _dmarc.example.com): is SPF or DKIM aligned with the From: domain? Outcomes: PASS (deliver to inbox), QUARANTINE (send to spam), or REJECT (bounce the message).]

ARC (Authenticated Received Chain)

ARC, defined in RFC 8617, solves a problem that mailing lists and forwarding services create for DMARC. When a mailing list receives a message and re-sends it to subscribers, SPF fails (the list server's IP is not authorized for the original sender's domain) and DKIM may break (the list might modify headers or add a footer). DMARC then sees a failure and might reject the message.

ARC preserves the authentication results from each hop. Each intermediary adds three ARC headers: ARC-Authentication-Results (what checks passed at that hop), ARC-Message-Signature (a DKIM-like signature of the message), and ARC-Seal (a signature chain linking all ARC sets). The final receiver can inspect the ARC chain to decide whether to trust the message despite DMARC failure, based on whether the intermediaries in the chain are trusted.

STARTTLS and Encryption in Transit

SMTP was designed as a plaintext protocol. STARTTLS (RFC 3207) is an extension that upgrades an existing plaintext SMTP connection to TLS encryption. After the initial EHLO exchange, the client sends the STARTTLS command, the server responds with 220, and both sides perform a TLS handshake. Subsequent SMTP commands travel over the encrypted channel.

STARTTLS has a critical vulnerability: it is opportunistic. A man-in-the-middle attacker can strip the STARTTLS capability from the server's EHLO response, forcing the connection to remain in plaintext. The sending server has no way to know that encryption should have been available. This is a STRIPTLS attack.
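The downgrade is easy to see from the client's side. In this Python sketch, the client inspects the EHLO reply it received; if an attacker has removed the 250-STARTTLS line, an opportunistic client quietly proceeds in plaintext, while a client that has some out-of-band reason to require TLS refuses. The function names and the require_tls flag are illustrative assumptions, not part of any SMTP library.

```python
def ehlo_extensions(reply_lines):
    """Extract extension keywords from a multi-line 250 EHLO reply."""
    keywords = set()
    for line in reply_lines[1:]:  # the first line is only the greeting
        # Lines look like "250-KEYWORD params" or "250 KEYWORD params".
        rest = line[4:].strip()
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def choose_transport(reply_lines, require_tls):
    """Decide how to continue after EHLO."""
    if "STARTTLS" in ehlo_extensions(reply_lines):
        return "starttls"
    if require_tls:
        raise ConnectionError("STARTTLS not offered; refusing plaintext delivery")
    return "plaintext"  # opportunistic client: silently downgraded


HONEST = [
    "250-mx.example.com Hello mail.sender.com",
    "250-SIZE 52428800",
    "250-STARTTLS",
    "250 8BITMIME",
]
# The same reply after an on-path attacker removed one line.
STRIPPED = [line for line in HONEST if "STARTTLS" not in line]

print(choose_transport(HONEST, require_tls=False))
print(choose_transport(STRIPPED, require_tls=False))
```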

MTA-STS (Mail Transfer Agent Strict Transport Security)

MTA-STS (RFC 8461) closes this gap. A domain publishes a policy (via a well-known HTTPS URL and a DNS TXT record) declaring that all mail to that domain must use TLS with a valid certificate. Sending servers that support MTA-STS cache this policy and refuse to deliver mail over plaintext connections, even if STARTTLS is stripped.
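A sending server's use of a cached policy might look roughly like the following Python sketch. It assumes the policy file has already been fetched over HTTPS and validated; the hostnames and the helper names are invented, and details such as max_age expiry and the testing mode's reporting behaviour are left out.

```python
POLICY_TEXT = """\
version: STSv1
mode: enforce
mx: mx.example.com
mx: *.mail.example.com
max_age: 86400
"""


def parse_mta_sts_policy(text):
    """Parse the key: value lines of an MTA-STS policy file."""
    policy = {"mx": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value.lower())  # mx may appear several times
        else:
            policy[key] = value
    return policy


def mx_allowed(mx_host, policy):
    """Check an MX hostname against the policy's mx patterns."""
    host = mx_host.lower().rstrip(".")
    for pattern in policy["mx"]:
        if pattern.startswith("*."):
            # A wildcard stands for exactly one leftmost label.
            if "." in host and host.split(".", 1)[1] == pattern[2:]:
                return True
        elif host == pattern:
            return True
    return False


def may_deliver(mx_host, tls_ok, policy):
    """tls_ok: TLS was negotiated and the certificate validated."""
    if policy.get("mode") != "enforce":
        return True
    return tls_ok and mx_allowed(mx_host, policy)


policy = parse_mta_sts_policy(POLICY_TEXT)
print(may_deliver("mx.example.com.", tls_ok=False, policy=policy))
```

Because the decision depends on a policy cached from an earlier HTTPS fetch, stripping STARTTLS from a later SMTP session no longer results in a silent plaintext delivery.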

The related DANE (DNS-Based Authentication of Named Entities) protocol uses DNSSEC-signed TLSA records to publish the expected TLS certificate for a mail server, providing even stronger protection against certificate manipulation.

TLS Reporting (TLSRPT)

RFC 8460 defines TLS Reporting, which lets domain owners receive reports about TLS negotiation failures when other servers try to deliver mail to them. This helps identify misconfigured certificates, downgrade attacks, and connectivity problems.

BIMI: Brand Indicators for Message Identification

BIMI is a specification that lets organizations display their brand logo next to authenticated emails in supporting mail clients. BIMI builds on DMARC — a domain must have a DMARC policy of quarantine or reject to qualify. The domain publishes a BIMI DNS record pointing to its logo (an SVG file in a specific format) and, for stronger verification, a Verified Mark Certificate (VMC) issued by a certificate authority that validates trademark ownership.

Gmail, Apple Mail, and Yahoo Mail support BIMI. When you see a brand logo next to a sender's name instead of a generic avatar, that is BIMI in action — it provides visual confirmation that the email was authenticated and the sender is who they claim to be.

Modern Email Infrastructure

The mail server landscape spans open-source software, commercial appliances, and cloud platforms:

Postfix is one of the most widely deployed open-source MTAs. Written by Wietse Venema as a secure alternative to Sendmail, Postfix handles mail routing, relay, and delivery with a modular architecture where each function runs as a separate process with minimal privileges. It powers mail systems from small organizations to major ISPs.

Microsoft Exchange is the dominant enterprise mail platform, offering integrated calendaring, contacts, and mailbox management. Exchange Server runs on-premises, while Exchange Online is the backbone of Microsoft 365's email service. Exchange uses MAPI (Messaging Application Programming Interface) and Exchange ActiveSync for client connectivity, alongside standard IMAP and SMTP.

Google Workspace (formerly G Suite) runs Gmail's infrastructure for organizations. Google's mail servers handle delivery for millions of domains: the MX records for a Google Workspace customer typically point to Google-operated mail exchangers (such as aspmx.l.google.com) running within Google's network (AS15169).

Other notable MTAs include Exim (default on Debian systems, highly configurable), OpenSMTPD (from the OpenBSD project, focused on simplicity and security), and Haraka (a Node.js-based MTA designed for high-volume sending). On the delivery/mailbox side, Dovecot is the dominant open-source IMAP/POP3 server, often paired with Postfix.

Spam Filtering Techniques

Spam accounts for roughly 45% of all email traffic. Modern spam filtering uses multiple layers:

Email Deliverability

For legitimate senders, getting mail to the inbox (not the spam folder) is a constant challenge. Deliverability depends on:

How Phishing Exploits Email Trust

Phishing attacks exploit the fundamental trust model of email. Common techniques include:

DMARC with p=reject prevents direct domain spoofing but cannot stop lookalike domains or compromised accounts. BIMI helps by training users to expect a brand logo on legitimate messages — its absence on a lookalike domain is a visual cue that something is wrong.

Email Headers and Tracing the Delivery Path

Every email carries a set of headers that document its journey. While most users never see them, headers are essential for debugging delivery issues, investigating phishing, and understanding mail routing. Key headers to examine:

To trace an email's path through the internet, extract the IP addresses from Received: headers and look them up. Each IP maps to a BGP prefix originated by an autonomous system. This reveals whether the message actually traversed the networks you would expect. A message claiming to be from a US-based company but routed through unexpected networks is a red flag.
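The extraction step can be sketched with Python's standard library. The sample headers below are invented and use documentation address ranges; the regular expression assumes the common convention of putting the connecting IP in square brackets, which real-world Received: headers do not always follow. Mapping each IP to a prefix and autonomous system needs an external data source and is not shown.

```python
import ipaddress
import re
from email import message_from_string, policy

RAW = (
    "Received: from mx.example.com (mx.example.com [203.0.113.25])\r\n"
    "\tby mailstore.example.com with ESMTP; Fri, 24 Apr 2026 10:30:05 -0700\r\n"
    "Received: from mail.sender.example (mail.sender.example [198.51.100.10])\r\n"
    "\tby mx.example.com with ESMTPS; Fri, 24 Apr 2026 10:30:03 -0700\r\n"
    "From: Alice <alice@sender.example>\r\n"
    "Date: Fri, 24 Apr 2026 10:30:00 -0700\r\n"
    "\r\n"
    "Hi Bob\r\n"
)

# Matches "[198.51.100.10]" and the "[IPv6:2001:db8::1]" form.
BRACKETED = re.compile(r"\[(?:IPv6:)?([0-9A-Fa-f:.]+)\]")


def received_hops(raw_message):
    """Return the bracketed IPs from Received: headers, oldest hop first."""
    msg = message_from_string(raw_message, policy=policy.default)
    hops = []
    # Each relay prepends its header, so reverse for chronological order.
    for header in reversed(msg.get_all("Received", [])):
        for candidate in BRACKETED.findall(str(header)):
            try:
                hops.append(str(ipaddress.ip_address(candidate)))
            except ValueError:
                continue  # bracketed text that is not an IP literal
    return hops


print(received_hops(RAW))
```

Keep in mind that any server along the path can insert forged Received: headers below its own, so only the entries added by servers you trust are reliable evidence.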

The Relationship Between Email and BGP

Email depends on BGP at every stage. The MX records for a domain resolve to IP addresses that are reachable only because BGP carries their routes across the internet. A BGP hijack targeting a mail server's prefix could redirect email to an attacker, intercept messages, or cause delivery failures. Attacks of this kind have happened in practice: in 2018, attackers hijacked BGP routes for address space used by Amazon's Route 53 DNS service so that they could answer DNS queries themselves. That incident targeted a cryptocurrency website, but the same technique applied to a mail domain would let an attacker answer MX lookups and redirect its email.

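SPF validation also depends on routing, though indirectly. The receiver does not connect to the IP addresses listed in an SPF record; it compares them against the connecting server's IP. It does, however, have to reach the authoritative name servers for the sender's domain (and for any include: targets) to fetch the record. If BGP routes to those name servers are disrupted, the lookups fail and SPF returns a temporary error, potentially causing legitimate mail to be deferred or rejected. The security of the entire email authentication system ultimately rests on the security of DNS (which DNSSEC protects) and BGP (which RPKI protects).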


Key RFCs and Standards

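Email is defined by a deep stack of RFCs accumulated over decades. Those cited in this article are RFC 5321 (SMTP), RFC 5322 (message format), RFCs 2045-2049 (MIME), RFC 9051 (IMAP), RFC 1939 (POP3), RFC 8620 (JMAP), RFC 7208 (SPF), RFC 6376 (DKIM), RFC 7489 (DMARC), RFC 8617 (ARC), RFC 3207 (STARTTLS), RFC 8461 (MTA-STS), and RFC 8460 (TLS Reporting).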
