Introduction
Setting up a production-ready Kubernetes cluster requires careful planning and configuration. This guide walks you through creating a multi-node cluster using kubeadm and containerd as the container runtime.
Prerequisites
Minimum Requirements:
- 2 GB RAM per machine
- 2 CPUs per machine
- Network connectivity between machines
- Unique hostname, MAC address, and product_uuid on every node (a quick check follows these lists)
- Swap disabled
Recommended for Production:
- 3+ control plane nodes (HA)
- 3+ worker nodes
- 4 GB+ RAM per node
- Load balancer for control plane
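A quick way to confirm the uniqueness requirements on each machine:
# MAC addresses (compare across nodes)
ip link
# product_uuid (must differ on every node)
sudo cat /sys/class/dmi/id/product_uuid
# Swap should show no entries once disabled in Step 1
swapon --show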
Architecture
Load Balancer (Optional for HA)
↓
Control Plane Nodes (3)
↓
Worker Nodes (3+)
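If you front the control plane with a load balancer, any TCP load balancer that forwards port 6443 to the control plane nodes works. A minimal sketch using HAProxy, assuming three hypothetical control plane nodes at 10.0.0.11-13 (substitute your own addresses):
# /etc/haproxy/haproxy.cfg (fragment)
frontend kubernetes-api
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-control-plane

backend kubernetes-control-plane
    mode tcp
    balance roundrobin
    option tcp-check
    # Example addresses; replace with your control plane node IPs
    server cp-1 10.0.0.11:6443 check
    server cp-2 10.0.0.12:6443 check
    server cp-3 10.0.0.13:6443 check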
Step 1: Prepare All Nodes
On all nodes (control plane + workers):
# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
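Before moving on, verify that the modules are loaded and the sysctl values are active:
# Both modules should be listed
lsmod | grep -e overlay -e br_netfilter
# Each of these should print 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward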
Step 2: Install containerd
# Install containerd
sudo apt-get update
sudo apt-get install -y containerd
# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
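To confirm the cgroup driver change took effect, dump the merged configuration:
# Should print: SystemdCgroup = true
sudo containerd config dump | grep SystemdCgroup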
Step 3: Install Kubernetes Components
# Add Kubernetes repo
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install kubelet, kubeadm, kubectl
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# Enable kubelet
sudo systemctl enable kubelet
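A quick sanity check that the pinned components installed correctly:
kubeadm version
kubectl version --client
kubelet --version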
Step 4: Initialize Control Plane
On the first control plane node:
# Initialize cluster (--upload-certs stores the control plane certificates
# in the cluster so additional control plane nodes can join later)
sudo kubeadm init \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=<CONTROL_PLANE_IP> \
  --control-plane-endpoint=<LOAD_BALANCER_IP>:6443 \
  --upload-certs
# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
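kubeadm init ends by printing kubeadm join commands; save them, as they are needed in Step 6 and the High Availability section. The node itself reports NotReady until a CNI plugin is installed (Step 5); you can watch its status with:
kubectl get nodes -w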
Step 5: Install CNI Plugin
Install Calico:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Or Flannel:
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
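Whichever plugin you choose, wait for its pods in kube-system to reach Running; the nodes then flip from NotReady to Ready:
kubectl get pods -n kube-system
kubectl get nodes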
Step 6: Join Worker Nodes
On each worker node:
# Use join command from kubeadm init output
sudo kubeadm join <CONTROL_PLANE_IP>:6443 \
  --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH>
If the token has expired, generate a new one:
# On control plane
kubeadm token create --print-join-command
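If you also need the CA cert hash, it can be recomputed on the control plane at any time; this matches the value kubeadm printed at init:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'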
Step 7: Verify Cluster
kubectl get nodes
kubectl get pods --all-namespaces
kubectl cluster-info
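For kubectl get nodes, healthy output looks roughly like this (names, ages, and patch versions will differ):
# NAME       STATUS   ROLES           AGE   VERSION
# cp-1       Ready    control-plane   12m   v1.28.x
# worker-1   Ready    <none>          5m    v1.28.x
# worker-2   Ready    <none>          4m    v1.28.x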
High Availability Setup
For an HA control plane, additional control plane nodes join with the --control-plane flag and the certificate key printed during init:
# On additional control plane nodes
sudo kubeadm join <LOAD_BALANCER_IP>:6443 \
  --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH> \
  --control-plane \
  --certificate-key <CERT_KEY>
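The certificate key uploaded during init expires after two hours. If it has lapsed, re-upload the certificates from an existing control plane node to get a fresh key:
sudo kubeadm init phase upload-certs --upload-certs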
Install Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
# Create admin user
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF
# Get token
kubectl -n kubernetes-dashboard create token admin-user
# Access dashboard
kubectl proxy
# Visit: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
Production Best Practices
- Use an HA control plane (3+ nodes)
- Enable RBAC
- Implement network policies (a default-deny starting point follows this list)
- Set up monitoring (Prometheus/Grafana)
- Configure etcd backups (see below)
- Use a private registry
- Enforce Pod Security Standards (PodSecurityPolicy was removed in Kubernetes v1.25)
- Apply updates regularly
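As a concrete starting point for network policies, a default-deny ingress policy forces all inbound pod traffic to be whitelisted explicitly. A minimal sketch, applied here to the default namespace (note the CNI must enforce policies; Calico does, plain Flannel does not):
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF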
Backup etcd
# Backup
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
# Restore (writes the snapshot into a fresh data directory;
# point etcd's --data-dir at it afterwards)
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --data-dir /var/lib/etcd-restore
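Before trusting a snapshot, inspect it (on etcd 3.5+ the same subcommand also lives in the etcdutl tool):
# Prints hash, revision, total keys, and size
ETCDCTL_API=3 etcdctl snapshot status snapshot.db --write-out=table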
Troubleshooting
Node stuck in NotReady:
kubectl describe node <node-name>
journalctl -u kubelet -f
Pod networking issues:
kubectl get pods -n kube-system
kubectl logs -n kube-system <calico-pod>
Certificate issues:
kubeadm certs check-expiration
kubeadm certs renew all
Upgrading the Cluster
# Upgrade control plane
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-get install -y kubeadm='1.28.x-1.1'   # pkgs.k8s.io packages use a -1.1 revision, not the legacy -00
sudo apt-mark hold kubeadm
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.28.x
# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.28.x-1.1' kubectl='1.28.x-1.1'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
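The commands above cover the control plane; worker nodes follow the same pattern, upgraded one at a time (the kubectl commands run from a machine with admin access):
# On the worker: upgrade kubeadm and apply the node upgrade
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm='1.28.x-1.1'
sudo apt-mark hold kubeadm
sudo kubeadm upgrade node
# Drain the node, then upgrade kubelet and kubectl on it
kubectl drain <node-name> --ignore-daemonsets
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.28.x-1.1' kubectl='1.28.x-1.1'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# Put the node back into service
kubectl uncordon <node-name>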
Conclusion
A properly configured production cluster provides reliability, scalability, and security for your workloads.