How Linux Network Namespaces Work: Container Networking Internals
Linux network namespaces are a kernel feature that provides isolated instances of the network stack -- each namespace gets its own network interfaces, IP addresses, routing tables, firewall rules, and socket tables, completely independent of every other namespace on the same host. Network namespaces are the foundational building block of container networking: every Docker container, every Kubernetes pod, and every sandbox wired up by a CNI plugin is, at its core, a Linux network namespace connected to the host through virtual network devices. Understanding how namespaces work at the kernel level is essential for debugging container networking issues, designing overlay networks, and grasping why primitives like ip netns, veth pairs, and bridge devices exist.
What a Network Namespace Isolates
The Linux kernel maintains a set of data structures collectively known as the "network stack." These include the list of network interfaces, the routing table (FIB), the ARP/neighbor cache, Netfilter rules (iptables/nftables), connection tracking (conntrack), socket tables, and traffic control qdiscs. In the default (init) namespace, all these structures are shared by every process on the system.
When you create a new network namespace, the kernel instantiates a completely separate copy of all these structures. A process running inside the new namespace sees only the interfaces, routes, and firewall rules that belong to that namespace. It cannot see or interact with interfaces in other namespaces. This isolation is not achieved through access control (like file permissions) but through structural separation -- the kernel literally maintains separate data structures for each namespace.
A freshly created namespace contains only a loopback interface (lo), and it is down by default. There are no routes, no iptables rules, no interfaces connected to the outside world. The namespace is completely isolated. To make it useful, you must explicitly create virtual network devices and attach them to the namespace.
- Network interfaces -- Each namespace has its own set. An interface belongs to exactly one namespace at a time, but can be moved between namespaces.
- IP addresses -- Addresses are bound to interfaces, which are bound to namespaces. Two namespaces can use the same IP address on different interfaces without conflict.
- Routing table -- Each namespace has its own routing table. A default route in one namespace does not affect routing in another.
- Iptables/nftables -- Firewall rules are per-namespace. Rules in the host namespace do not apply to traffic inside a container namespace (unless the traffic traverses a host interface).
- Socket table -- Listening sockets are namespace-scoped. A server listening on port 80 in one namespace does not conflict with port 80 in another.
- Conntrack table -- Connection tracking state is per-namespace, which affects NAT and stateful firewall behavior.
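All of these tables surface under /proc/net, which is a symlink to /proc/self/net and therefore always shows the current namespace's view -- a quick way to see exactly what a process inside a container sees:

```shell
# /proc/net is per-namespace: it resolves through /proc/self/net, so
# each process sees only its own namespace's tables here
cat /proc/self/net/route   # this namespace's IPv4 routing table (FIB)
cat /proc/self/net/arp     # this namespace's ARP/neighbor cache
cat /proc/self/net/tcp     # this namespace's TCP socket table
```

Run the same commands inside a container and you get that container's tables, not the host's.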
Creating and Managing Namespaces with ip netns
The ip netns command (part of the iproute2 package) is the standard tool for managing named network namespaces. Under the hood, namespaces are created via the unshare(CLONE_NEWNET) or clone(CLONE_NEWNET) system calls. The ip netns add command creates the namespace and then bind-mounts its /proc/self/ns/net file at /var/run/netns/<name>; that bind mount keeps the namespace alive even when no process is running inside it.
# Create a namespace
ip netns add red
# List namespaces
ip netns list
# Run a command inside the namespace
ip netns exec red ip link list
# Output: only sees "lo" (loopback), and it's DOWN
# Bring up loopback
ip netns exec red ip link set lo up
# Delete the namespace
ip netns delete red
Docker and Kubernetes do not use ip netns to create namespaces. They call the kernel system calls directly and do not create bind mounts in /var/run/netns/. This is why ip netns list does not show container namespaces by default. To inspect a container's namespace with ip netns tools, you can create a symlink:
# Find the container's PID
PID=$(docker inspect -f '{{.State.Pid}}' my-container)
# Create a symlink so ip netns can see it (ensure the directory exists)
mkdir -p /var/run/netns
ln -s /proc/$PID/ns/net /var/run/netns/my-container
# Now you can use ip netns exec
ip netns exec my-container ip addr show
Veth Pairs: Virtual Ethernet Cables
A veth pair (virtual Ethernet pair) is a pair of virtual network interfaces connected like a crossover cable: any packet sent into one end comes out the other. Veth pairs are the primary mechanism for connecting a network namespace to the outside world. One end of the pair lives in the namespace, and the other end lives in the host namespace (or another namespace).
# Create a veth pair
ip link add veth-host type veth peer name veth-ns
# Move one end into the namespace
ip link set veth-ns netns red
# Configure the host end
ip addr add 10.0.0.1/24 dev veth-host
ip link set veth-host up
# Configure the namespace end
ip netns exec red ip addr add 10.0.0.2/24 dev veth-ns
ip netns exec red ip link set veth-ns up
# Test connectivity
ip netns exec red ping 10.0.0.1 # works
ping 10.0.0.2 # works
At this point, the host namespace and the red namespace can communicate over the 10.0.0.0/24 link. But the namespace has no route to anything beyond 10.0.0.0/24. To give it internet access, you need to set up routing and NAT:
# Add a default route in the namespace, pointing to the host end
ip netns exec red ip route add default via 10.0.0.1
# Enable IP forwarding on the host
sysctl -w net.ipv4.ip_forward=1
# Add a masquerade (SNAT) rule for traffic from the namespace
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
This is exactly what Docker does for every container: create a veth pair, put one end in the container namespace, put the other end on a bridge, add masquerade rules for outbound traffic, and optionally add DNAT rules for published ports.
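The NAT half of that pattern boils down to two rules. This is a simplified sketch: real Docker installs equivalents in its own DOCKER and DOCKER-USER chains, and the container address 172.17.0.2 and the 8080->80 mapping here are illustrative, not values Docker guarantees.

```shell
# Outbound: rewrite container source addresses to the host's address
# for any traffic leaving through an interface other than the bridge
iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
# Inbound: publish host port 8080 to a container's port 80 -- roughly
# what "docker run -p 8080:80" arranges
iptables -t nat -A PREROUTING -p tcp --dport 8080 \
  -j DNAT --to-destination 172.17.0.2:80
```

Both rules require root and modify the host's firewall configuration, so treat them as a lab sketch rather than production setup.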
Bridge Devices: L2 Switching in Software
A Linux bridge is a virtual Layer 2 switch that connects multiple network interfaces within the same broadcast domain. When you create a bridge and add veth endpoints to it, frames entering any bridge port are forwarded to other ports based on MAC address learning, exactly like a physical Ethernet switch.
# Create a bridge
ip link add br0 type bridge
ip link set br0 up
# Assign an IP to the bridge (acts as default gateway for namespaces)
ip addr add 10.0.0.1/24 dev br0
# Attach the veth host ends to the bridge (one pair per namespace,
# created as shown earlier)
ip link set veth-host-red master br0
ip link set veth-host-blue master br0
With this configuration, the "red" and "blue" namespaces can communicate with each other through the bridge (they are on the same L2 segment, 10.0.0.0/24), and both can reach the host via the bridge IP (10.0.0.1). The bridge maintains a forwarding database (FDB) that maps MAC addresses to ports, which you can inspect with bridge fdb show.
Docker's default networking mode (bridge mode) does exactly this. The docker0 bridge is created when the Docker daemon starts, and each container's veth host-side interface is attached to docker0. The bridge has an IP address (typically 172.17.0.1) that serves as the default gateway for all containers. Iptables masquerade rules on the host provide containers with internet access.
Routing Between Namespaces Without a Bridge
A bridge is not always necessary. For point-to-point connectivity between two namespaces, you can use veth pairs with routing:
# Create two namespaces with a direct veth link
ip netns add ns1
ip netns add ns2
# Create a veth pair
ip link add veth1 type veth peer name veth2
# Put each end in a different namespace
ip link set veth1 netns ns1
ip link set veth2 netns ns2
# Configure with /30 point-to-point subnets
ip netns exec ns1 ip addr add 10.0.0.1/30 dev veth1
ip netns exec ns1 ip link set veth1 up
ip netns exec ns2 ip addr add 10.0.0.2/30 dev veth2
ip netns exec ns2 ip link set veth2 up
# They can now ping each other
ip netns exec ns1 ping 10.0.0.2
For more complex topologies, you can chain namespaces together with multiple veth pairs and configure routing between them. This is how you build virtual lab environments to test routing protocols, firewall rules, or complex network topologies -- all on a single Linux machine.
macvlan and ipvlan: Alternatives to Bridges
Veth pairs through a bridge involve two virtual interfaces per container and an L2 switching decision at the bridge. For performance-sensitive workloads, Linux provides alternatives:
- macvlan -- Creates sub-interfaces on a physical NIC, each with a unique MAC address. Frames are demultiplexed by the NIC driver based on destination MAC. This bypasses the bridge entirely and delivers packets directly to the correct namespace. macvlan has four modes: private (no inter-container communication), VEPA (traffic goes through the external switch even for local destinations), bridge (local forwarding without an external switch), and passthru (one-to-one mapping).
- ipvlan -- Similar to macvlan but all sub-interfaces share the parent's MAC address. Demultiplexing is done at L3 (by IP address) instead of L2. This is useful in environments where MAC address limits are enforced (some cloud providers limit the number of MAC addresses per interface, and some network switches restrict ports to a single MAC). ipvlan has L2 mode (like a bridge, same subnet) and L3 mode (acts as a router, each sub-interface can have a different subnet).
macvlan and ipvlan are used by some Kubernetes CNI plugins (e.g., Multus, for attaching additional network interfaces to pods) and by Docker's macvlan network driver. They offer lower latency than veth+bridge because there are fewer hops in the packet path, but they have operational caveats: macvlan sub-interfaces cannot exchange traffic with the host through the parent interface, so containers on macvlan cannot reach the host's own IP via that NIC (a common source of confusion), and ipvlan requires all traffic to go through the host's routing table.
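As a sketch of the macvlan approach (assuming a parent NIC named eth0, an existing namespace red, an illustrative LAN address, and root privileges), a single sub-interface replaces the veth pair and bridge:

```shell
# Create a macvlan sub-interface on eth0 in bridge mode, so local
# sub-interfaces can reach each other without the external switch
ip link add mv0 link eth0 type macvlan mode bridge
# Move it into the namespace and configure it like any other interface
ip link set mv0 netns red
ip netns exec red ip addr add 192.168.1.50/24 dev mv0
ip netns exec red ip link set mv0 up
# The namespace now sits directly on the physical LAN -- but, per the
# caveat above, it cannot reach the host's own address through eth0
```

The interface name mv0 and the 192.168.1.0/24 addressing are placeholders; in practice the address usually comes from the LAN's DHCP range or IPAM.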
How Docker Uses Network Namespaces
Docker's networking is a direct application of the primitives described above. When you run docker run --net bridge my-image (the default), Docker:
- Creates a new network namespace for the container using clone(CLONE_NEWNET)
- Creates a veth pair
- Moves one end of the veth pair into the container namespace and renames it eth0
- Attaches the other end to the docker0 bridge
- Assigns an IP from the bridge's subnet (e.g., 172.17.0.x/16) to the container's eth0
- Sets the bridge IP as the container's default gateway
- Configures DNS in the container's /etc/resolv.conf (typically pointing to Docker's embedded DNS at 127.0.0.11)
- Adds iptables rules for masquerading (outbound NAT) and port publishing (DNAT)
Docker also supports --net host (container shares the host namespace -- no isolation), --net none (namespace with no connectivity), and --net container:<id> (share another container's namespace). The last mode is critical for Kubernetes.
How Kubernetes Uses Network Namespaces
In Kubernetes, the unit of network identity is the pod, not the container. All containers within a pod share a single network namespace, which means they share IP addresses, port space, and localhost. This is implemented using the --net container: model:
- The kubelet creates a pause container (also called the "sandbox" or "infra" container) -- a tiny container whose sole purpose is to hold the network namespace
- The CNI plugin is invoked to set up networking in the pause container's namespace (create veth pair, assign IP, configure routes)
- All other containers in the pod are created with --net container:<pause-container-id>, joining the same namespace
- The pause container runs indefinitely (it executes /pause, which just calls pause()), keeping the namespace alive even if application containers restart
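You can confirm a shared namespace from the host by comparing namespace inodes: two processes are in the same network namespace exactly when their /proc/<pid>/ns/net links resolve to the same net:[inode] value. The pod PIDs in the trailing comment are placeholders.

```shell
# A shell and the readlink processes it spawns share one namespace,
# so both links resolve to the same net:[inode] identity
readlink /proc/self/ns/net
readlink /proc/$$/ns/net
# For a pod, compare the pause container's PID against an app
# container's PID the same way, e.g.:
#   readlink /proc/<pause-pid>/ns/net /proc/<app-pid>/ns/net
```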
The CNI (Container Network Interface) plugin is responsible for the actual namespace configuration. Different CNIs use different approaches:
- Flannel -- VXLAN overlay. Creates a veth pair, attaches the host end to a bridge (cni0), and encapsulates cross-node traffic in VXLAN tunnels.
- Calico -- Direct routing (default). Creates a veth pair, attaches the host end directly (no bridge), and uses BGP to distribute pod routes across nodes. Can also use VXLAN or WireGuard for overlay.
- Cilium -- eBPF-based. Creates a veth pair and programs eBPF maps for routing, load balancing, and policy enforcement instead of using iptables or bridges.
- AWS VPC CNI -- Allocates IPs from the VPC using ENIs. Pods get VPC-routable IPs, and the host's routing table directs traffic to the correct veth pair.
Network Namespace Lifecycle and /proc
Every process on Linux has a network namespace, visible as a file in /proc/<pid>/ns/net. Two processes are in the same namespace if their /proc/<pid>/ns/net symlinks point to the same inode. You can enter an existing process's network namespace using nsenter:
# Enter PID 12345's network namespace
nsenter --target 12345 --net ip addr show
# Enter all namespaces of a process (net, pid, mnt, etc.)
nsenter --target 12345 --all bash
A namespace stays alive as long as at least one of the following is true: a process is running inside it, a bind mount exists (as ip netns add creates), or a file descriptor is held open to the namespace file. When all references are gone, the kernel destroys the namespace and reclaims its resources -- physical interfaces that were moved in are returned to the init namespace, while virtual devices (like the container end of a veth pair) are destroyed, and destroying one end of a veth pair removes its peer as well.
The unshare command creates a new namespace and runs a command inside it:
# Create a new network namespace and run a shell
unshare --net bash
# Inside this shell, you're in an isolated namespace
ip link list # only shows lo (down)
ip addr # no addresses configured
Advanced Topologies: Network Namespaces as Routers
Network namespaces are not just for containers -- they are a powerful tool for building complex virtual network topologies for testing, development, and education. You can create namespaces that act as routers, firewalls, or NAT gateways:
# Simulate a router between two networks
ip netns add router
ip netns add lan1
ip netns add lan2
# Link lan1 to router
ip link add lan1-eth0 type veth peer name router-eth0
ip link set lan1-eth0 netns lan1
ip link set router-eth0 netns router
# Link lan2 to router
ip link add lan2-eth0 type veth peer name router-eth1
ip link set lan2-eth0 netns lan2
ip link set router-eth1 netns router
# Configure addresses
ip netns exec lan1 ip addr add 10.1.0.2/24 dev lan1-eth0
ip netns exec lan1 ip link set lan1-eth0 up
ip netns exec router ip addr add 10.1.0.1/24 dev router-eth0
ip netns exec router ip addr add 10.2.0.1/24 dev router-eth1
ip netns exec router ip link set router-eth0 up
ip netns exec router ip link set router-eth1 up
ip netns exec lan2 ip addr add 10.2.0.2/24 dev lan2-eth0
ip netns exec lan2 ip link set lan2-eth0 up
# Enable forwarding in the router namespace
ip netns exec router sysctl -w net.ipv4.ip_forward=1
# Add routes
ip netns exec lan1 ip route add default via 10.1.0.1
ip netns exec lan2 ip route add default via 10.2.0.1
# lan1 can now reach lan2 through the router
ip netns exec lan1 ping 10.2.0.2
This pattern is used extensively in network emulators like Mininet, lab environments like GNS3, and educational platforms. You can run BGP daemons (BIRD, FRR) inside namespaces to simulate multi-router topologies, test OSPF adjacencies, or reproduce complex routing scenarios -- all on a single Linux machine without any VMs or physical hardware.
Other Namespace Types and Their Interactions
Network namespaces do not exist in isolation -- they interact with other Linux namespace types that together form the foundation of containers:
- PID namespace -- Isolates the process ID space. Processes in a PID namespace see their own PID 1 and cannot see processes in other PID namespaces. This is orthogonal to network namespaces but combined in containers.
- Mount namespace -- Isolates the filesystem mount table. Containers use this to provide their own /etc/resolv.conf, /etc/hosts, and /etc/hostname without affecting the host. The mount namespace determines what the container sees at these paths, while the network namespace determines how DNS actually resolves.
- User namespace -- Maps UID/GID between namespaces. A process can be root (UID 0) inside its user namespace while being an unprivileged user on the host. User namespaces are critical for rootless containers but add complexity to network namespace management because creating network interfaces typically requires host-level privileges.
- cgroup namespace -- Isolates the cgroup hierarchy view. Combined with network namespaces, cgroup-level eBPF hooks (used by Cilium) can be scoped to specific containers.
Containers are the combination of all these namespace types plus cgroups (for resource limits). A "container" is not a kernel primitive -- it is a userspace abstraction built from these individual isolation mechanisms. Understanding network namespaces in isolation helps you debug the networking component independent of filesystem, PID, or user isolation concerns.
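All of a process's namespace memberships sit side by side under /proc, one symlink per namespace type:

```shell
# One link per namespace type; the inode in each identifies which
# instance of that namespace the process belongs to
ls -l /proc/self/ns/
# Expect entries such as: cgroup ipc mnt net pid user uts
# (newer kernels add pid_for_children and time as well)
```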
Performance Considerations
Network namespaces add minimal overhead to the packet path. The main costs are:
- Veth pair traversal -- Each veth hop involves a context switch between namespaces. Benchmarks show roughly 1-3 microseconds of additional latency per veth pair compared to local communication. For typical container workloads, this is negligible.
- Bridge forwarding -- The Linux bridge performs MAC learning and table lookup for each frame. At high packet rates (millions of packets per second), bridge forwarding can become CPU-bound. This is one reason Cilium's eBPF approach (which uses bpf_redirect to bypass the bridge entirely) outperforms traditional bridge-based CNIs.
- Conntrack and iptables -- NAT and stateful firewalling add conntrack overhead per connection and iptables rule evaluation per packet. In clusters with many services, the iptables chain can become very long, adding measurable latency. This is a major motivator for eBPF-based and IPVS-based alternatives.
- Namespace count -- Linux can handle tens of thousands of network namespaces. The practical limit is usually memory for per-namespace data structures (routing tables, conntrack tables, iptables rules) rather than the namespaces themselves.
Network Namespaces and the Internet
Every container you run, every Kubernetes pod you deploy, and every network-isolated process on a Linux system ultimately communicates with the outside world through a chain of namespaces, virtual devices, and routing decisions. The traffic that exits the host's physical interface carries the host's IP address -- part of a subnet assigned by the network operator, within a prefix announced by an autonomous system via BGP.