Introduction

Setting up a production-ready Kubernetes cluster requires careful planning and configuration. This guide walks you through creating a multi-node cluster with kubeadm, using containerd as the container runtime.

Prerequisites

Minimum Requirements:

  • 2 GB RAM per machine
  • 2 CPUs per machine
  • Network connectivity between machines
  • Unique hostname, MAC address, and product_uuid on every node (verification commands below)
  • Swap disabled

Recommended for Production:

  • 3+ control plane nodes (HA)
  • 3+ worker nodes
  • 4 GB+ RAM per node
  • Load balancer for control plane
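
The uniqueness and swap requirements can be checked up front on each node; all of the following are standard Linux commands:

# Check uniqueness and swap status on every node
hostname
ip link show                                # MAC addresses
sudo cat /sys/class/dmi/id/product_uuid
swapon --show                               # no output means swap is off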

Architecture

Load Balancer (required for an HA control plane)
    ↓
Control Plane Nodes (3)
    ↓
Worker Nodes (3+)

Step 1: Prepare All Nodes

On all nodes (control plane + workers):

# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# Load kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Configure sysctl
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system
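
To confirm the modules are loaded and the sysctl values took effect (standard verification, matching the kubeadm prerequisites):

# Verify kernel modules and sysctl settings
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward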

Step 2: Install Containerd

# Install containerd
sudo apt-get update
sudo apt-get install -y containerd

# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
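
A quick sanity check that containerd is healthy and the cgroup driver change was picked up:

# Confirm containerd is active and SystemdCgroup is enabled
systemctl is-active containerd                    # should print: active
grep SystemdCgroup /etc/containerd/config.toml    # should print: SystemdCgroup = true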

Step 3: Install Kubernetes Components

# Add Kubernetes repo
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

# Ensure the keyrings directory exists (not created by default on older releases)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

# Install kubelet, kubeadm, kubectl
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Enable kubelet
sudo systemctl enable kubelet
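
Confirm matching versions on every node before proceeding:

# Check installed versions (should be identical across nodes)
kubeadm version
kubelet --version
kubectl version --client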

Step 4: Initialize Control Plane

On first control plane node:

# Initialize cluster (--upload-certs is needed later for HA control plane joins;
# omit it and --control-plane-endpoint for a single control plane node)
sudo kubeadm init \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=<CONTROL_PLANE_IP> \
  --control-plane-endpoint=<LOAD_BALANCER_IP>:6443 \
  --upload-certs

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
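
Alternatively, if you are running as root, kubectl can use the admin kubeconfig directly (this shortcut is also printed in the kubeadm init output):

# Root-only alternative to copying the kubeconfig
export KUBECONFIG=/etc/kubernetes/admin.conf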

Step 5: Install CNI Plugin

Install Calico:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Or Flannel:

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
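
Once the CNI pods are running, the control plane node should move to Ready:

# Watch the CNI pods come up, then check node status
kubectl get pods -n kube-system -w
kubectl get nodes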

Step 6: Join Worker Nodes

On each worker node:

# Use the join command printed by kubeadm init (if --control-plane-endpoint was
# set, it targets <LOAD_BALANCER_IP>:6443 rather than a single node's address)
sudo kubeadm join <CONTROL_PLANE_IP>:6443 \
  --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH>

If the token has expired, generate a new one:

# On control plane
kubeadm token create --print-join-command
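
If you need the CA certificate hash by itself (for example, to assemble a join command manually), it can be recomputed on a control plane node:

# Recompute the discovery token CA cert hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'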

Step 7: Verify Cluster

kubectl get nodes
kubectl get pods --all-namespaces
kubectl cluster-info
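
Workers show no ROLES value by default; an optional label makes the output easier to read (the node name is a placeholder):

# Optional: label workers so they report a role
kubectl label node <WORKER_NODE_NAME> node-role.kubernetes.io/worker=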

High Availability Setup

To add more control plane nodes behind the load balancer:

# On additional control plane nodes
sudo kubeadm join <LOAD_BALANCER_IP>:6443 \
  --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH> \
  --control-plane \
  --certificate-key <CERT_KEY>
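
The <CERT_KEY> is printed by kubeadm init when --upload-certs is used. Uploaded certificates expire after two hours; if yours have, re-upload them to get a fresh key:

# On an existing control plane node: re-upload certs and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs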

Install Dashboard

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

# Create admin user
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

# Get token
kubectl -n kubernetes-dashboard create token admin-user

# Access dashboard
kubectl proxy
# Visit: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

Production Best Practices

  1. Use an HA control plane (3+ nodes)
  2. Enable RBAC
  3. Implement network policies (example below)
  4. Set up monitoring (Prometheus/Grafana)
  5. Configure backups (etcd)
  6. Use a private registry
  7. Enforce Pod Security Standards (PodSecurityPolicy was removed in v1.25)
  8. Apply OS and Kubernetes patch updates regularly
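
As a concrete starting point for item 3, here is a minimal default-deny ingress NetworkPolicy (the namespace is a placeholder; note that Calico enforces NetworkPolicy while stock Flannel does not):

cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: <YOUR_NAMESPACE>
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF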

Backup etcd

# Backup (run on a control plane node where the etcdctl binary is available)
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Restore into a new data directory (the path below is an example), then point
# the etcd static pod's hostPath volume at the restored directory
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --data-dir /var/lib/etcd-restore
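
Before relying on a snapshot, verify it is readable:

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status snapshot.db --write-out=table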

Troubleshooting

Node not ready:

kubectl describe node <node-name>
journalctl -u kubelet -f

Pod networking issues:

kubectl get pods -n kube-system
kubectl logs -n kube-system <calico-pod>

Certificate issues:

sudo kubeadm certs check-expiration
sudo kubeadm certs renew all
# After renewal, restart the control plane components so they load the new
# certificates, and copy the refreshed /etc/kubernetes/admin.conf to ~/.kube/config

Upgrading Cluster

# Upgrade control plane
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-get install -y kubeadm='1.28.x-*'
sudo apt-mark hold kubeadm

sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.28.x

# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.28.x-*' kubectl='1.28.x-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
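
Worker nodes follow a similar pattern, with a drain/uncordon around the kubelet upgrade (kubectl commands run from a machine with cluster access; the node name is a placeholder):

# Upgrade kubeadm on the worker, then upgrade the node configuration
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm='1.28.x-*'
sudo apt-mark hold kubeadm
sudo kubeadm upgrade node

# Drain, upgrade kubelet/kubectl, restart, uncordon
kubectl drain <NODE_NAME> --ignore-daemonsets
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.28.x-*' kubectl='1.28.x-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
kubectl uncordon <NODE_NAME>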

Conclusion

A properly configured production cluster provides reliability, scalability, and security for your workloads.
