1. Overview
etcd is a distributed, reliable key-value store for the most critical data of a distributed system. Based on the Raft consensus algorithm, it provides strong consistency and high availability.
In Kubernetes, etcd is the single source of truth. It stores:
- Cluster configuration (API objects).
- Actual state info (Which pods are running where).
- Service discovery and config sharing.
- Backups and recovery data.
2. Basic Operations (etcdctl)
- Member List:
$ etcdctl member list - Storing Data:
$ etcdctl put key value - Retrieving Data:
$ etcdctl get key - Watching:
$ etcdctl watch key(Triggers a notification when the key changes).
3. Leases and TTL
A Lease is a mechanism for expiring keys after a set time.
- Grant Lease:
$ etcdctl lease grant 60(Creates a 60s lease). - Bind Key:
$ etcdctl put k v --lease=[LEASE_ID] - Keep-alive:
$ etcdctl lease keep-alive [ID](Extends the TTL). - Revoke:
$ etcdctl lease revoke [ID](Deletes the lease and associated keys).
Leases are commonly used for Distributed Locking and Service Discovery (e.g., if a node crashes and stops sending heartbeats, its lease expires and its registration key is removed).
4. Snapshots and Backup
A snapshot captures the entire state of the etcd database.
- Save:
$ etcdctl snapshot save backup.db - Restore:
$ etcdctl snapshot restore backup.db
5. Running Etcd in K8s
For production, etcd usually runs on dedicated nodes. For development, you can use Helm to deploy it inside your cluster.
- Add Repo:
helm repo add bitnami https://charts.bitnami.com/bitnami - Install:
helm install my-etcd bitnami/etcd --set persistence.enabled=false - Verify: Use a client pod to test K-V operations.
$ kubectl exec -it my-etcd-0 -- etcdctl put /test "Hello"
Summary
The reliability of your entire Kubernetes cluster depends on etcd. Understanding its consensus model, watching capabilities, and backup procedures is vital for any cluster administrator.