Keeping State and Networking in Kubernetes

In the previous installments of this series (see below), we learned a lot of neat things about Kubernetes: that it descends from Google’s once-secret Borg project, how its architecture fits together, and why it is a good choice for your datacenter. Now we’ll learn how Kubernetes keeps state with etcd, and how plain Linux networking ties everything together.

Key-Value Stores

Kubernetes needs a persistence layer to track the state of the cluster over time. Traditionally, this could be implemented with a relational database. In a highly scalable system, however, a relational database (e.g., MySQL or PostgreSQL) becomes a single point of failure. Distributed key-value stores are designed from the start to run on multiple nodes: data is replicated among the nodes with strong consistency, so the data store keeps working even when individual nodes fail. ZooKeeper, Consul, and etcd are all examples of distributed key-value stores.

Kubernetes uses etcd. etcd can run on a single node, though that provides no fault tolerance. Across multiple nodes, etcd uses the Raft consensus algorithm, which elects a leader, to keep the stored state strongly consistent among the nodes.
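At its core, the interface is just keys and values: you write a value at a key path and read it back. Here is a minimal sketch using the same etcd v2 etcdctl client shown below (the /example/message key is hypothetical):

$ etcdctl set /example/message "hello"
hello
$ etcdctl get /example/message
hello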

In a test setup, we also run a single-node etcd key-value store on the master node. We can check its contents with the etcdctl command and see what Kubernetes stores in it:

$ systemctl -a | grep etcd
etcd2.service    loaded    active    running    etcd2

$ etcdctl ls /registry
/registry/ranges
/registry/namespaces
/registry/serviceaccounts
/registry/controllers
/registry/secrets
/registry/pods
/registry/deployments
/registry/services
/registry/events
/registry/minions
/registry/replicasets

This gives you a sneak peek at some of the Kubernetes resources.
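You can drill further into any of these directories. As an illustrative sketch (the exact layout and encoding vary by Kubernetes version, and the output here is abridged), listing the namespaces directory and reading one entry returns the JSON-encoded object the API server stored:

$ etcdctl ls /registry/namespaces
/registry/namespaces/default
$ etcdctl get /registry/namespaces/default
{"kind":"Namespace","apiVersion":"v1","metadata":{"name":"default",...}}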

Networking Setup

Getting all of the previous components running is a familiar task for system administrators who are used to configuration management. But to get a fully functional Kubernetes cluster, the network must be set up properly as well.

If you have deployed virtual machines (VMs) on an IaaS solution, this will sound familiar. Containers running on each node attach to a Linux bridge. The bridge is configured to hand out IP addresses from a subnet specific to that node, and that subnet is routed to all the other nodes. In essence, you treat a container just like a VM: every container started on any node must be able to reach every other container, as the sketch below illustrates.
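To make this concrete, here is a minimal sketch of the plumbing on one node, assuming hypothetical pod subnets (10.244.1.0/24 for this node, 10.244.2.0/24 for a peer node whose host IP is 192.168.0.12):

# Create the bridge that containers on this node attach to
$ sudo ip link add cbr0 type bridge
$ sudo ip addr add 10.244.1.1/24 dev cbr0
$ sudo ip link set cbr0 up
# Route the peer node's container subnet through that node's host IP
$ sudo ip route add 10.244.2.0/24 via 192.168.0.12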

You can see a detailed explanation of this model in the Cluster Networking documentation. The only caveat is that in Kubernetes the smallest compute unit is not a container but a pod: a group of co-located containers that share the same IP address.
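You can reproduce the pod idea by hand with plain Docker, which makes for an instructive sketch (the container names are hypothetical): the second container joins the first container’s network namespace, so the two share one IP address and can talk over localhost.

$ docker run -d --name web nginx
$ docker run --rm --net=container:web busybox wget -qO- http://localhost:80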

Kubernetes expects this network configuration to be available; it does not create it automatically, so you have to set it up yourself. You can configure your physical network accordingly, or use a software-defined overlay such as Weave, Flannel, or Calico.
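With Flannel, for example, each node leases a subnet from the cluster-wide network and records it in an environment file. A quick sanity check might look like this (the path is Flannel’s conventional location; the values shown are illustrative):

$ cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.1.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true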

Tim Hockin, one of the lead Kubernetes developers, has created a useful slide deck, Illustrated Guide To Kubernetes Networking, to help you understand Kubernetes networking.

Download the sample chapter now.

Kubernetes Fundamentals

You may enjoy the previous entries in this series: