How VPC Networking Works: Subnets, Route Tables, and Gateways in the Cloud

A Virtual Private Cloud (VPC) is a logically isolated section of a cloud provider's network where you deploy resources with full control over IP addressing, routing, and access policies. It is the cloud equivalent of a physical data center's network -- you define subnets, configure route tables, attach gateways, and enforce security rules -- except the infrastructure is software-defined, API-driven, and spans multiple availability zones. Every major cloud provider (AWS, GCP, Azure) implements VPCs, and while the terminology differs slightly, the core architecture is remarkably consistent. Understanding VPC networking is essential for anyone running production workloads in the cloud, because misconfigured VPCs are behind a large fraction of cloud security incidents and outages.

The VPC as an Isolated Network Domain

When you create a VPC, you assign it a CIDR block -- a contiguous range of private IP addresses from the RFC 1918 space (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16) or, in some cases, publicly routable addresses. This CIDR block defines the total address space available for all resources within the VPC. AWS allows VPCs with a primary CIDR between /16 and /28, and you can add secondary CIDR blocks later. GCP takes a different approach: VPCs are global objects, and subnets within them can have independent CIDR ranges in different regions.
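Carving a CIDR block into subnets is simple arithmetic, and Python's standard ipaddress module does it directly. A minimal sketch, using the /16 VPC and /24 subnets from this article's example:

```python
# Sketch: carving a VPC CIDR into candidate subnets with Python's
# stdlib ipaddress module. The 10.0.0.0/16 block mirrors the example
# VPC used throughout this article.
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# subnets() yields non-overlapping child blocks of the requested size.
subnets = list(vpc.subnets(new_prefix=24))
print(len(subnets))               # 256 possible /24 subnets in a /16
print(subnets[1])                 # 10.0.1.0/24
print(subnets[1].subnet_of(vpc))  # True: every subnet stays inside the VPC CIDR
```

The same module is useful for validating that a planned subnet actually fits inside the VPC CIDR before you call the provider's API.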

The isolation is real. At the network layer, a VPC is a separate broadcast domain. Traffic between two VPCs does not flow unless you explicitly configure connectivity (peering, transit gateways, or VPN). This is enforced at the hypervisor level -- the virtual switch on each physical host only forwards packets to destinations within the same VPC, or to an explicitly connected gateway. The underlying mechanism varies by provider, but it typically involves some form of encapsulation: AWS uses a proprietary encapsulation similar to VXLAN to tunnel VPC traffic over the physical network, tagging packets with a VPC identifier so the hypervisor can enforce isolation.

Every VPC comes with an implicit router. You never see it as a discrete resource, but it exists at every subnet boundary, handles inter-subnet routing within the VPC, and consults route tables to decide where to send traffic. This implicit router is what makes the "virtual data center" abstraction work -- it provides Layer 3 connectivity between subnets without you having to deploy and manage router instances.

Subnets: Public, Private, and Isolated

A subnet is a partition of the VPC's CIDR block, associated with a specific availability zone (AZ). You carve the VPC CIDR into smaller blocks and assign each to a subnet. For example, a 10.0.0.0/16 VPC might be divided into:

- 10.0.1.0/24 and 10.0.2.0/24 -- public subnets in AZ-a and AZ-b for load balancers and bastion hosts
- 10.0.10.0/24 and 10.0.20.0/24 -- private subnets for application servers
- 10.0.100.0/24 and 10.0.200.0/24 -- isolated subnets for databases and caches

The distinction between public, private, and isolated subnets is not an inherent property of the subnet -- it is determined entirely by routing. A public subnet has a route table entry that sends 0.0.0.0/0 traffic to an internet gateway (IGW). Instances in this subnet can have public IPs and communicate directly with the internet. A private subnet routes 0.0.0.0/0 to a NAT gateway, allowing outbound internet access (for software updates, API calls) without exposing instances to inbound connections. An isolated subnet has no default route to the internet at all -- traffic can only flow within the VPC or to explicitly peered networks.

[Diagram: VPC architecture for 10.0.0.0/16. An internet gateway fronts public subnets 10.0.1.0/24 (AZ-a) and 10.0.2.0/24 (AZ-b) hosting ALBs and bastions; their route table sends 0.0.0.0/0 to the IGW. A NAT gateway serves private subnets 10.0.10.0/24 and 10.0.20.0/24 (app servers), whose route table sends 0.0.0.0/0 to the NAT GW. Isolated subnets 10.0.100.0/24 and 10.0.200.0/24 (RDS, ElastiCache) have a local-only route table with no internet route. All three route tables include the implicit 10.0.0.0/16 -> local route for intra-VPC routing.]

Each subnet is bound to a single AZ. This is a critical design constraint: if AZ-a goes down, subnets in AZ-a become unavailable. For high availability, you deploy redundant resources across multiple AZs, with subnets in each. AWS, for example, reserves the first four addresses and the last address of each subnet for internal use (network address, VPC router, DNS, future use, and broadcast), so a /24 subnet provides 251 usable addresses, not 254.
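The reserved-address math above is worth checking once by hand. A minimal sketch of the AWS-style accounting for a /24:

```python
# Sketch: usable-address math for an AWS-style subnet, where the
# provider reserves the first four addresses and the last one.
import ipaddress

subnet = ipaddress.ip_network("10.0.1.0/24")
total = subnet.num_addresses   # 256 for a /24
usable = total - 5             # minus network, router, DNS, future use, broadcast
print(total, usable)           # 256 251

reserved = [subnet.network_address + i for i in range(4)] + [subnet.broadcast_address]
print([str(a) for a in reserved])
# ['10.0.1.0', '10.0.1.1', '10.0.1.2', '10.0.1.3', '10.0.1.255']
```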

Route Tables and the Implicit Router

Every subnet is associated with exactly one route table (though one route table can be shared across multiple subnets). A route table is a set of destination CIDR/target pairs. Routes are not evaluated in insertion order; the VPC router selects among matching routes using longest-prefix matching, exactly as a hardware router would.

Every route table automatically includes a local route for the VPC CIDR (e.g., 10.0.0.0/16 -> local). This route cannot be deleted and ensures that all subnets within the VPC can communicate with each other by default. Additional routes can point to:

- an internet gateway, for direct internet access
- a NAT gateway, for outbound-only internet access
- a VPC peering connection or transit gateway attachment, for traffic to other networks
- a virtual private gateway, for VPN or Direct Connect traffic to on-premises networks
- a gateway VPC endpoint, for private access to services such as S3 and DynamoDB
- a network interface, for routing through a firewall or NAT appliance

A common gotcha: if you add a more specific route that overlaps with the local route, the more specific route wins due to longest-prefix matching. This can inadvertently break intra-VPC connectivity if you are not careful. For example, adding a route for 10.0.10.0/24 -> some-appliance will intercept traffic to that subnet even from within the VPC.
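Longest-prefix matching, including the gotcha above, can be sketched in a few lines. This toy lookup is not the provider's implementation; the "some-appliance" target is the hypothetical one from the example:

```python
# Sketch: longest-prefix route lookup, the way the VPC's implicit
# router chooses a target. The appliance target is hypothetical.
import ipaddress

route_table = {
    "10.0.0.0/16":  "local",           # implicit VPC route
    "10.0.10.0/24": "some-appliance",  # more specific: overrides local for this subnet
    "0.0.0.0/0":    "igw",             # default route to the internet gateway
}

def lookup(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(c) for c in route_table
               if addr in ipaddress.ip_network(c)]
    best = max(matches, key=lambda n: n.prefixlen)  # longest prefix wins
    return route_table[str(best)]

print(lookup("10.0.10.5"))     # some-appliance (not local!)
print(lookup("10.0.20.5"))     # local
print(lookup("93.184.216.34")) # igw
```

Note how traffic to 10.0.10.5 is intercepted by the /24 route even though the local /16 also matches: exactly the intra-VPC breakage the text warns about.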

Internet Gateways and NAT Gateways

An Internet Gateway (IGW) is a horizontally scaled, redundant, fully managed component that performs two functions: it acts as a target in route tables for internet-bound traffic, and it performs one-to-one NAT between an instance's private IP and its associated public or Elastic IP. Unlike a NAT gateway, an IGW does not perform port-address translation -- each instance gets a dedicated public IP mapping. The IGW is stateless and does not become a bottleneck; AWS does not impose bandwidth limits on it.

A NAT Gateway performs port-address translation (PAT), allowing many private instances to share a single public IP address for outbound connections. It maintains a connection tracking table, maps each outbound connection to a unique source port on the NAT IP, and rewrites return packets. NAT gateways are AZ-scoped -- you need one per AZ for high availability. AWS NAT gateways support up to 55,000 simultaneous connections per destination IP and can burst to 100 Gbps. The cost model (per-hour plus per-GB data processing) makes NAT gateways one of the most expensive networking components in a typical AWS bill.
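The connection-tracking table a NAT gateway maintains can be illustrated with a toy model. This is a sketch, not the managed service's implementation; the public NAT IP is made up, and the private IPs come from this article's example VPC:

```python
# Sketch: the port-address translation (PAT) a NAT gateway performs.
# Many private sources share one public IP, distinguished by source port.
import itertools

NAT_PUBLIC_IP = "203.0.113.10"  # hypothetical Elastic IP on the NAT gateway
_port = itertools.count(1024)   # next free source port on the NAT IP
conntrack = {}                  # (private_src, dst) -> NAT source port

def translate_outbound(private_src, dst):
    """Rewrite (private ip, port) -> (NAT ip, port) for a new or existing flow."""
    key = (private_src, dst)
    if key not in conntrack:
        conntrack[key] = next(_port)  # allocate a unique port per connection
    return (NAT_PUBLIC_IP, conntrack[key])

def translate_return(nat_port):
    """Map a return packet back to the originating private endpoint."""
    for (private_src, dst), port in conntrack.items():
        if port == nat_port:
            return private_src
    return None  # no conntrack entry: the packet is dropped

a = translate_outbound(("10.0.10.5", 41000), ("93.184.216.34", 443))
b = translate_outbound(("10.0.10.6", 41000), ("93.184.216.34", 443))
print(a, b)                    # same NAT IP, different source ports
print(translate_return(a[1]))  # ('10.0.10.5', 41000)
```

The per-destination connection limit in the text falls out of this model: the NAT IP has a finite source-port space per destination.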

For cost optimization, many organizations use VPC endpoints to bypass NAT gateways entirely for traffic to AWS services. A gateway endpoint for S3, for instance, adds a route table entry that sends S3-bound traffic directly through the AWS backbone, avoiding NAT gateway data processing charges.

Security Groups: Stateful Instance-Level Firewalls

A security group is a stateful firewall attached to an elastic network interface (ENI). Every instance, RDS database, Lambda function (in a VPC), and load balancer has one or more security groups. Security group rules define allowed inbound and outbound traffic by protocol, port range, and source/destination. Crucially, security groups are default-deny inbound, default-allow outbound -- if you create a security group with no rules, no inbound traffic is permitted, but all outbound traffic is allowed.

The stateful nature means that if you allow inbound TCP port 443, the return traffic for those connections is automatically permitted regardless of outbound rules. This is implemented via connection tracking at the hypervisor level. The implication: you do not need to create explicit outbound rules for return traffic, and you do not need to worry about ephemeral port ranges.
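A minimal sketch of that connection-tracking behavior (toy model, hypothetical IPs) shows why no outbound rule is needed for return traffic:

```python
# Sketch: why a stateful firewall needs no outbound rule for return
# traffic. An accepted inbound flow is recorded, and the reverse
# 5-tuple is then allowed automatically.
allowed_inbound = {("tcp", 443)}  # the only inbound rule
conntrack = set()                 # established flows

def inbound(proto, src, sport, dst, dport):
    if (proto, dport) in allowed_inbound:
        conntrack.add((proto, src, sport, dst, dport))
        return "ACCEPT"
    return "DROP"

def outbound(proto, src, sport, dst, dport):
    # Return traffic: the reverse of an established flow is always allowed,
    # regardless of any outbound rules.
    if (proto, dst, dport, src, sport) in conntrack:
        return "ACCEPT (return traffic)"
    return "checked against outbound rules"

print(inbound("tcp", "1.2.3.4", 50000, "10.0.1.5", 443))   # ACCEPT
print(outbound("tcp", "10.0.1.5", 443, "1.2.3.4", 50000))  # ACCEPT (return traffic)
```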

Security groups have a powerful feature: self-referencing rules. You can create a rule where the source is another security group (or even the same group). For example, a "backend" security group can allow inbound port 8080 from the "load-balancer" security group. This decouples security policy from IP addresses -- instances can be added or removed from either group without updating firewall rules. This is a major advantage over traditional IP-based ACLs, especially in auto-scaling environments where instance IPs are ephemeral.
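The decoupling of policy from IP addresses can be sketched as rule evaluation against group membership rather than CIDRs. Group names and IPs below are hypothetical:

```python
# Sketch: evaluating a security group rule whose source is another
# security group rather than a CIDR. Membership, not addresses,
# decides the match.
sg_members = {
    "load-balancer": {"10.0.1.10", "10.0.2.10"},
    "backend":       {"10.0.10.5", "10.0.20.5"},
}

# backend allows inbound 8080 only from members of the load-balancer SG
backend_rules = [{"port": 8080, "source_sg": "load-balancer"}]

def allowed(src_ip, dport):
    return any(r["port"] == dport and src_ip in sg_members[r["source_sg"]]
               for r in backend_rules)

print(allowed("10.0.1.10", 8080))  # True: an ALB node
print(allowed("10.0.99.9", 8080))  # False: not in the group

# Scaling out just updates membership; no firewall rule changes needed.
sg_members["load-balancer"].add("10.0.3.10")
print(allowed("10.0.3.10", 8080))  # True
```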

Network ACLs: Stateless Subnet-Level Firewalls

Network Access Control Lists (NACLs) are stateless firewalls applied at the subnet level. Unlike security groups, NACLs evaluate traffic in both directions independently -- if you allow inbound port 443, you must separately allow the outbound ephemeral port range for return traffic. NACLs are processed in rule number order (lowest first), and the first matching rule determines the action. Each rule can either ALLOW or DENY traffic.

The default NACL allows all inbound and outbound traffic. Custom NACLs default to deny-all. In practice, most organizations use security groups for primary access control and reserve NACLs for coarse-grained subnet-level restrictions -- for example, blocking an IP range known to be malicious, or restricting traffic between subnet tiers as a defense-in-depth measure.

The stateless nature of NACLs is the most common source of configuration errors. Consider HTTPS traffic: you need an inbound rule allowing TCP 443 and an outbound rule allowing TCP on ephemeral ports 1024-65535 (the range where the OS allocates source ports for reply traffic). Forgetting the outbound rule breaks all HTTPS connections. This is why security groups (stateful) are preferred for most use cases -- they handle return traffic automatically.
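The rule-number evaluation order and the ephemeral-port mistake can be sketched in a few lines, following the HTTPS example above:

```python
# Sketch: stateless NACL evaluation in rule-number order, and the
# classic ephemeral-port mistake from the HTTPS example in the text.
def evaluate(rules, dport):
    for num, (lo, hi), action in sorted(rules):  # lowest rule number first
        if lo <= dport <= hi:
            return action                        # first match wins
    return "DENY"                                # implicit deny at the end

inbound_rules   = [(100, (443, 443), "ALLOW")]
outbound_broken = []                             # forgot the ephemeral-port rule
outbound_fixed  = [(100, (1024, 65535), "ALLOW")]

print(evaluate(inbound_rules, 443))         # ALLOW: the request gets in
print(evaluate(outbound_broken, 50123))     # DENY: the reply on an ephemeral port dies
print(evaluate(outbound_fixed, 50123))      # ALLOW: the reply is permitted
```

Because the NACL keeps no connection state, the outbound reply is evaluated as if it were a brand-new flow, which is why the broken table silently kills every HTTPS session.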

Security Groups vs Network ACLs:

                 Security Groups                Network ACLs
Scope            ENI (instance-level)           Subnet-level
Stateful         Yes (return traffic auto)      No (both directions needed)
Rules            ALLOW only                     ALLOW and DENY
Evaluation       All rules (union)              Rule number order (first match)
Default          Deny all inbound               Allow all (default ACL)
Source/Dest      SG ID or CIDR                  CIDR only

A security group wraps each ENI: inbound traffic is checked, return traffic is automatically allowed. A NACL wraps the entire subnet boundary: return traffic must be explicitly allowed, including the ephemeral port range.

VPC Peering

VPC peering creates a direct network connection between two VPCs, allowing resources in either VPC to communicate using private IP addresses as if they were on the same network. Peering works across accounts and across regions (inter-region peering). The connection is established by sending a peering request from one VPC to another; the owner of the target VPC must accept the request before traffic can flow.

VPC peering has important limitations:

- No transitive routing: if VPC A peers with B and B peers with C, A cannot reach C through B. Full connectivity among n VPCs requires a full mesh of n(n-1)/2 peering connections.
- No overlapping CIDRs: two VPCs with overlapping address ranges cannot be peered.
- No edge-to-edge routing: a peered VPC cannot use the other VPC's internet gateway, NAT gateway, VPN, or Direct Connect connection.

Despite these limitations, peering is useful for simple point-to-point connectivity -- connecting a production VPC to a shared services VPC (logging, monitoring, CI/CD), or enabling cross-account access between teams.

Transit Gateway: Hub-and-Spoke at Scale

AWS Transit Gateway (TGW) solves the peering scalability problem. It acts as a regional network hub -- you attach VPCs, VPN connections, Direct Connect gateways, and even peered transit gateways to it, and it handles routing between all attached networks. Instead of O(n^2) peering connections, you need O(n) attachments.
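The scaling difference is plain arithmetic. A quick sketch of full-mesh peering counts versus transit gateway attachments:

```python
# Sketch: why full-mesh peering does not scale. n VPCs need
# n*(n-1)/2 peering connections for full connectivity, but only
# n attachments to a transit gateway.
def full_mesh_peerings(n):
    return n * (n - 1) // 2

def tgw_attachments(n):
    return n

for n in (2, 5, 10, 50):
    print(n, full_mesh_peerings(n), tgw_attachments(n))
# 50 VPCs: 1225 peering connections vs 50 TGW attachments
```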

Transit gateways support route tables with route propagation. Attached VPCs and VPNs can automatically propagate their routes into the TGW route table, or you can define static routes for more control. Multiple route tables enable network segmentation: you might have a "production" route table and a "development" route table, with different VPCs attached to each, effectively creating isolated routing domains within the same transit gateway.

Transit gateways support inter-region peering, enabling you to build a global network backbone across AWS regions. Traffic between peered transit gateways stays on the AWS backbone -- it does not traverse the public internet. Combined with Direct Connect, a transit gateway can serve as the central hub for hybrid cloud networking, connecting dozens of VPCs and on-premises data centers through a single architecture.

The cost model is per-attachment (hourly) plus per-GB data processing, which can be significant at scale. For simple two-VPC connectivity, peering is cheaper. Transit gateways become cost-effective when you have more than 3-4 VPCs that need mutual connectivity.

VPC Endpoints and PrivateLink

VPC endpoints allow resources in your VPC to connect to AWS services without going through the internet, NAT gateway, or VPN. There are two types:

- Gateway endpoints, which add an entry to your route tables directing traffic for S3 or DynamoDB through the AWS backbone. They are free and support only those two services.
- Interface endpoints, which place an elastic network interface with a private IP address in your subnet for the target service. They support most AWS services, are powered by PrivateLink, and are billed per hour plus per GB.

AWS PrivateLink is the underlying technology for interface endpoints, but it is also a service exposure mechanism. You can create your own PrivateLink service by placing a Network Load Balancer in front of your application. Other AWS accounts can then create interface endpoints to connect to your service. Traffic flows over the AWS backbone, the consumer never sees your VPC's IP addresses, and you control access via endpoint policies. This is how many SaaS products (Datadog, Snowflake, MongoDB Atlas) offer private connectivity to their services.

VPC Flow Logs

VPC Flow Logs capture metadata about IP traffic flowing through network interfaces in your VPC. Each flow log record includes the source IP, destination IP, source port, destination port, protocol, packet count, byte count, action (ACCEPT/REJECT), and the interface ID. Flow logs can be attached at three levels: VPC (captures all traffic), subnet, or individual ENI.

Flow logs are essential for security analysis, troubleshooting connectivity issues, and compliance. Common uses include:

- diagnosing connectivity failures by looking for REJECT records caused by security group or NACL rules
- detecting unexpected traffic, such as instances communicating with unknown external IPs
- identifying top talkers and unexpected data transfer costs
- producing audit evidence that network segmentation controls are working

Flow logs do not capture the packet payload -- only metadata. Records aggregate each flow over a capture window rather than logging individual packets, and there is a delay of several minutes before records are available. For full packet capture, you need traffic mirroring, which copies actual packets to a monitoring appliance.

Multi-Cloud VPC Equivalents

While the term "VPC" originated with AWS, all major cloud providers implement the same concept with slightly different names and architectural choices:

- AWS calls it a VPC; VPCs are regional, and subnets are zonal.
- Google Cloud also calls it a VPC, but VPCs are global objects with regional subnets.
- Azure calls it a Virtual Network (VNet); VNets are regional, and subnets span the availability zones within a region.

The most significant architectural difference is GCP's global VPC model. In AWS and Azure, a VPC/VNet is confined to a single region, and cross-region connectivity requires peering or transit gateways. In GCP, a single VPC spans all regions, and subnets in different regions can communicate without any additional configuration. This simplifies multi-region deployments but requires more careful CIDR planning since all subnets share the same routing domain.
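Careful CIDR planning mostly means catching overlaps before they are deployed. A pre-flight sketch using the stdlib ipaddress module (the planned ranges below are invented; the same check matters for peering, since peered VPCs cannot overlap):

```python
# Sketch: pre-flight CIDR overlap check for subnets or VPCs that will
# share one routing domain. The planned ranges are hypothetical.
import ipaddress
from itertools import combinations

planned = ["10.0.0.0/16", "10.1.0.0/16", "10.0.128.0/17"]  # last one collides

nets = [ipaddress.ip_network(c) for c in planned]
for a, b in combinations(nets, 2):
    if a.overlaps(b):
        print(f"conflict: {a} overlaps {b}")
# conflict: 10.0.0.0/16 overlaps 10.0.128.0/17
```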

Common VPC Design Patterns

Production VPC architectures typically follow one of these patterns:

- Single VPC, multi-AZ, three tiers: public subnets for load balancers, private subnets for application servers, isolated subnets for databases -- the layout shown in the architecture diagram earlier.
- Hub and spoke: a transit gateway connects many workload VPCs to a shared services VPC (logging, monitoring, CI/CD) and to on-premises networks.
- Multi-account: each team or environment gets its own account and VPC, connected through a transit gateway or shared via the provider's resource-sharing mechanism.

VPC Networking and the Global Routing Table

VPC networking operates at a layer above the public internet routing system, but the two are deeply interconnected. When an instance in a public subnet communicates with the internet, its traffic exits through the IGW, which translates its private IP to a public IP that belongs to the cloud provider's autonomous system. That public IP is part of a prefix announced via BGP by the provider's edge routers to the global routing table. AWS, for instance, announces its prefixes from AS16509 and AS14618, which you can observe in the global routing table. The cloud provider's backbone handles routing between its regions, points of presence, and internet exchange points, but from the perspective of the rest of the internet, your VPC's public IPs are just addresses within the provider's BGP announcements. Use the god.ad looking glass to look up your cloud instances' public IPs and trace the AS path from any vantage point back to the provider's network.
