Introduction

Not all workloads need to run continuously or be replicated across nodes. DaemonSets ensure one Pod per node (perfect for logging agents), while Jobs and CronJobs handle batch processing and scheduled tasks.

Understanding Different Workload Types

Kubernetes Workload Controllers:

Controller  | Purpose         | Replicas     | Lifecycle         | Use Case
Deployment  | Stateless apps  | Multiple     | Continuous        | Web servers, APIs
StatefulSet | Stateful apps   | Multiple     | Continuous        | Databases, queues
DaemonSet   | Node services   | One per node | Continuous        | Logging, monitoring
Job         | Batch tasks     | Configurable | Run to completion | Data processing
CronJob     | Scheduled tasks | Configurable | Scheduled         | Backups, reports

Why Different Controllers?

  • Deployments: For applications that can scale horizontally
  • DaemonSets: For node-level infrastructure services
  • Jobs: For one-time or batch processing tasks
  • CronJobs: For recurring scheduled operations

Part 1: DaemonSets - Running One Pod Per Node

What is a DaemonSet? A DaemonSet ensures that all (or some) nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed, those Pods are garbage collected.

Why Use DaemonSets?

  • Node-Level Services: Every node needs the service (logging, monitoring)
  • Automatic Scaling: New nodes automatically get the Pod
  • Infrastructure Services: Network plugins, storage daemons
  • Cluster-Wide Operations: Security agents, performance monitoring

DaemonSet vs Deployment:

Feature        | DaemonSet             | Deployment
Pods per Node  | Exactly 1             | Variable (based on replicas)
Scaling        | Automatic with nodes  | Manual or HPA
Node Selection | All or selected nodes | Scheduler decides
Use Case       | Node services         | Application services
Example        | Log collector         | Web application

How DaemonSets Work:

  1. DaemonSet controller watches for nodes
  2. Creates one Pod on each matching node
  3. If node added → Pod created automatically
  4. If node removed → Pod deleted automatically
  5. If Pod fails → Recreated on same node

Common Use Cases:

1. Log Collection:

  • Fluentd, Filebeat, Logstash
  • Collect logs from all nodes
  • Forward to centralized logging

2. Monitoring:

  • Prometheus Node Exporter
  • cAdvisor
  • Collect metrics from each node

3. Network:

  • Calico, Weave, Flannel
  • CNI plugins for networking
  • Run on every node

4. Storage:

  • Ceph, GlusterFS
  • Distributed storage daemons
  • Node-level storage services

5. Security:

  • Security agents
  • Vulnerability scanners
  • Compliance monitoring

Example 1: Basic DaemonSet (Log Collector)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      # Tolerate the control-plane taint so logs are collected there too
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.14
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers

Example 2: DaemonSet on Specific Nodes (Monitoring)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # Only run on nodes with monitoring label
      nodeSelector:
        monitoring: "true"
      hostNetwork: true  # Use host network
      hostPID: true      # Access host processes
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        ports:
        - containerPort: 9100
          hostPort: 9100
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys

Example 3: DaemonSet with Update Strategy

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: security-agent
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # Update one node at a time
  selector:
    matchLabels:
      app: security-agent
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      containers:
      - name: agent
        image: security-agent:v2.0

DaemonSet Commands:

# List DaemonSets
kubectl get daemonsets
kubectl get ds  # Short form
kubectl get ds -A  # All namespaces

# Describe DaemonSet
kubectl describe daemonset fluentd

# Check which nodes have DaemonSet pods
kubectl get pods -o wide -l app=fluentd

# Update DaemonSet image
kubectl set image daemonset/fluentd fluentd=fluent/fluentd:v1.15

# Delete DaemonSet
kubectl delete daemonset fluentd

# Delete DaemonSet but keep pods
kubectl delete daemonset fluentd --cascade=orphan

Part 2: Jobs - Running Tasks to Completion

What is a Job? A Job creates one or more Pods and ensures that a specified number of them successfully terminate. Jobs track successful completions and retry failed Pods.

Why Use Jobs?

  • Batch Processing: Process large datasets
  • One-time Tasks: Database migrations, data imports
  • Parallel Processing: Distribute work across multiple Pods
  • Finite Workloads: Tasks that complete and exit

Job vs Deployment:

Feature          | Job               | Deployment
Lifecycle        | Run to completion | Continuous
Restart          | On failure only   | Always
Success Criteria | Completions count | Always running
Use Case         | Batch tasks       | Long-running services

How Jobs Work:

  1. Job controller creates Pods
  2. Pods run until successful completion
  3. Failed Pods are retried (up to backoffLimit)
  4. Job completes when desired completions reached
  5. Pods remain for log inspection (unless cleaned up)

Job Patterns:

1. Simple Job (Single Completion)

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-calculation
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never  # Never or OnFailure
  backoffLimit: 4  # Retry up to 4 times

2. Parallel Jobs (Work Queue Pattern)

apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-processing
spec:
  parallelism: 3        # Run 3 pods in parallel
  completions: 10       # Complete 10 tasks total
  template:
    spec:
      containers:
      - name: worker
        image: worker:latest
        command: ["./process-task.sh"]
      restartPolicy: Never

How it works:

  • Creates 3 Pods initially
  • As each Pod completes, new Pod starts
  • Continues until 10 successful completions

3. Parallel Jobs (Fixed Completion Count)

apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
spec:
  parallelism: 5
  completions: 100
  template:
    spec:
      containers:
      - name: processor
        image: data-processor:v1
        env:
        - name: TASK_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      restartPolicy: OnFailure

4. Job with Timeout

apiVersion: batch/v1
kind: Job
metadata:
  name: timeout-job
spec:
  activeDeadlineSeconds: 300  # Fail after 5 minutes
  backoffLimit: 3
  template:
    spec:
      containers:
      - name: task
        image: long-running-task:latest
      restartPolicy: Never

5. Job with Resource Limits

apiVersion: batch/v1
kind: Job
metadata:
  name: resource-intensive-job
spec:
  template:
    spec:
      containers:
      - name: processor
        image: heavy-processor:latest
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"
          limits:
            memory: "4Gi"
            cpu: "2000m"
      restartPolicy: Never

Job Configuration Options:

Field                   | Description                             | Default
completions             | Number of successful completions needed | 1
parallelism             | Max pods running in parallel            | 1
backoffLimit            | Number of retries before marking failed | 6
activeDeadlineSeconds   | Max time job can run                    | None
ttlSecondsAfterFinished | Auto-delete after completion            | None
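
As a minimal sketch of how these fields combine (the Job name and command are illustrative), the following Job retries at most twice, fails after ten minutes, and is deleted automatically five minutes after finishing:

apiVersion: batch/v1
kind: Job
metadata:
  name: cleanup-demo
spec:
  backoffLimit: 2               # Retry failed pods at most twice
  activeDeadlineSeconds: 600    # Mark the job failed after 10 minutes
  ttlSecondsAfterFinished: 300  # Delete job and pods 5 minutes after it finishes
  template:
    spec:
      containers:
      - name: task
        image: busybox:1.36
        command: ["sh", "-c", "echo processing && sleep 5"]
      restartPolicy: Never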

Job Commands:

# List jobs
kubectl get jobs
kubectl get jobs -w  # Watch

# Describe job
kubectl describe job pi-calculation

# View logs
kubectl logs job/pi-calculation
kubectl logs -f job/pi-calculation  # Follow

# Check job status
kubectl get job pi-calculation -o yaml

# Delete job
kubectl delete job pi-calculation

# Delete job and wait until its pods are gone (foreground cascade)
kubectl delete job pi-calculation --cascade=foreground

# Auto-cleanup of completed jobs is configured in the Job spec,
# not via kubectl, e.g.: ttlSecondsAfterFinished: 100

Part 3: CronJobs - Scheduled Jobs

What is a CronJob? A CronJob creates Jobs on a repeating schedule. It’s like cron in Linux but for Kubernetes Jobs.

Why Use CronJobs?

  • Scheduled Backups: Database, file backups
  • Report Generation: Daily/weekly reports
  • Data Cleanup: Remove old data periodically
  • Health Checks: Periodic system checks
  • Batch Processing: Scheduled data processing

CronJob vs Job:

Feature    | CronJob          | Job
Execution  | Scheduled        | One-time
Trigger    | Time-based       | Manual
Recurrence | Repeating        | Single
Use Case   | Backups, reports | Migrations, imports

Basic CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"  # Every day at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/bin/sh", "-c", "backup-script.sh"]
          restartPolicy: OnFailure

Cron Schedule Examples:

*/5 * * * *     # Every 5 minutes
0 */2 * * *     # Every 2 hours
0 0 * * 0       # Every Sunday at midnight
0 0 1 * *       # First day of month
0 9-17 * * 1-5  # 9 AM to 5 PM, Monday-Friday

Advanced CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: database-backup
spec:
  schedule: "0 2 * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  concurrencyPolicy: Forbid  # Don't run if previous still running
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:14
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
            command:
            - /bin/sh
            - -c
            - pg_dump -h db-host -U postgres mydb > /backup/backup-$(date +%Y%m%d).sql
            volumeMounts:
            - name: backup-volume
              mountPath: /backup
          volumes:
          - name: backup-volume
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

Concurrency Policies:

  • Allow: Allow concurrent jobs
  • Forbid: Skip if previous still running
  • Replace: Cancel previous, start new
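
As a hedged sketch (the image name is hypothetical), a CronJob using Replace cancels a still-running job when the next schedule fires, which suits tasks where only the latest run matters; startingDeadlineSeconds additionally skips runs that cannot start in time:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: metrics-rollup
spec:
  schedule: "*/10 * * * *"     # Every 10 minutes
  concurrencyPolicy: Replace   # Cancel a still-running job, start the new one
  startingDeadlineSeconds: 60  # Skip a run if it cannot start within 60s
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: rollup
            image: metrics-rollup:v1  # hypothetical image
          restartPolicy: OnFailure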

Commands:

kubectl get cronjobs
kubectl describe cronjob backup-job
kubectl get jobs --watch
kubectl delete cronjob backup-job

Production Examples

Log Collection DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      - key: node-role.kubernetes.io/master  # pre-v1.24 clusters
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.logging"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Database Backup CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 3 * * *"
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: postgres:14-alpine
            env:
            - name: PGHOST
              value: postgres-service
            - name: PGUSER
              value: postgres
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            command:
            - /bin/sh
            - -c
            - |
              BACKUP_FILE="/backup/db-$(date +%Y%m%d-%H%M%S).sql.gz"
              pg_dump mydb | gzip > "$BACKUP_FILE"
              echo "Backup completed: $BACKUP_FILE"
              # Keep only last 7 days
              find /backup -name "db-*.sql.gz" -mtime +7 -delete
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

Best Practices

DaemonSets:

  1. Use for node-level services only
  2. Set resource limits
  3. Use tolerations for master nodes if needed
  4. Monitor DaemonSet health
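
These practices are often combined with a priority class so the kubelet does not evict the agent under node pressure; a hedged fragment (name and image are hypothetical, system-node-critical is a built-in priority class):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent  # hypothetical name
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      priorityClassName: system-node-critical  # protects the agent from eviction
      containers:
      - name: agent
        image: node-agent:v1  # hypothetical image
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            memory: 128Mi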

Jobs:

  1. Set backoffLimit appropriately
  2. Use activeDeadlineSeconds for timeouts
  3. Clean up completed jobs
  4. Use parallelism for batch processing

CronJobs:

  1. Set history limits
  2. Use concurrencyPolicy wisely
  3. Test schedules before production
  4. Monitor job failures
  5. Implement idempotency
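
One low-risk way to test a schedule (point 3) is to deploy the CronJob suspended and unsuspend it once verified; suspend is a standard spec field (the name and image here are illustrative):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-job
spec:
  schedule: "0 6 * * 1"  # Mondays at 6 AM
  suspend: true          # Created paused; no jobs run until unsuspended
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: report
            image: report-generator:v1  # hypothetical image
          restartPolicy: OnFailure

Unsuspending is a one-line patch: kubectl patch cronjob report-job -p '{"spec":{"suspend":false}}'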

Troubleshooting

DaemonSet not on all nodes:

kubectl describe daemonset fluentd
# Check: Node selectors, taints, resource constraints

Job not completing:

kubectl describe job my-job
kubectl logs job/my-job
# Check: Container errors, resource limits, backoffLimit

CronJob not running:

kubectl describe cronjob backup-job
kubectl get jobs
# Trigger a run manually to isolate schedule problems
kubectl create job --from=cronjob/backup-job manual-test
# Check: Schedule syntax, concurrency policy, suspended status

Conclusion

DaemonSets, Jobs, and CronJobs handle specialized workloads:

  • DaemonSets: Node-level services
  • Jobs: One-time batch tasks
  • CronJobs: Scheduled tasks

Next: Kubernetes Ingress

Resources