Introduction
A core principle of Docker is that containers should be ephemeral and stateless. This means that any data created inside a container’s writable layer is lost forever when the container is removed. While this is great for creating predictable and clean environments, it poses a significant challenge for any application that needs to save data—like databases, user uploads, or logs.
To solve this, Docker provides three primary mechanisms for getting data into and out of containers: Volumes, Bind Mounts, and tmpfs mounts. This guide will provide a deep dive into each, with practical examples to help you understand when and how to use them effectively.
Why is Data Persistence So Crucial?
Imagine these common scenarios:
- You are running a PostgreSQL database in a container. If the container is removed, your entire database is wiped out.
- Your web application allows users to upload images. If those images are stored inside the container, they will be lost upon the next deployment.
- Your application generates critical log files. If you don’t persist them, you lose all diagnostic information when the container is removed.
Volumes and bind mounts are the solution. They decouple the data from the container’s lifecycle by storing it on the host machine’s filesystem.
Part 1: Docker Volumes (The Preferred Method)
A volume is the most robust and recommended mechanism for persisting data in Docker. A volume is a directory on the host machine that is managed by Docker.
Key Characteristics of Volumes:
- Managed by Docker: Docker creates and manages the volume in a dedicated area of the host filesystem (e.g., /var/lib/docker/volumes/ on Linux). You don’t need to know the exact location.
- Decoupled and Portable: Because Docker manages the volume, it’s decoupled from the host’s file structure, making your setup more portable.
- Safe to Share: Multiple containers can mount the same volume at the same time. Docker does not coordinate concurrent writes, so the applications must handle that themselves.
- Easy to Back Up and Migrate: Docker provides commands to manage volumes, making them easy to back up or migrate to other hosts.
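One common way to back up a volume is to mount it into a throwaway container alongside a host directory and let tar copy the contents out. The sketch below assumes the alpine image is available; the function name backup_volume and the /data and /backup mount points are hypothetical names chosen for illustration, not part of Docker.

```shell
# Sketch: archive the contents of a named volume into a tarball on the host.
# backup_volume, /data and /backup are hypothetical names.
backup_volume() {
  volume="$1"                      # name of the volume to back up
  archive="${2:-$volume.tar.gz}"   # output file, created in the current directory

  # Mount the volume read-only at /data and the current directory at /backup,
  # then let tar inside a temporary alpine container write the archive.
  docker run --rm \
    -v "$volume":/data:ro \
    -v "$(pwd)":/backup \
    alpine tar czf "/backup/$archive" -C /data .
}

# Usage (requires a running Docker daemon):
#   backup_volume my-app-data
```

Mounting the volume with :ro means the backup container cannot alter the data it is archiving.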
Step 1: Creating and Inspecting a Named Volume
It’s a best practice to use named volumes.
- Create a volume:
docker volume create my-app-data
- List volumes:
docker volume ls
- Inspect the volume: This command gives you details, including the actual location on the host (Mountpoint).
docker volume inspect my-app-data
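If you only need the host path rather than the full JSON output, the inspect command accepts a Go template through --format. This is a small sketch; volume_mountpoint is a hypothetical helper name.

```shell
# Sketch: print only the Mountpoint field of a named volume.
# volume_mountpoint is a hypothetical helper name.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage (requires a running Docker daemon):
#   volume_mountpoint my-app-data
```

On a Linux host the printed path normally sits under /var/lib/docker/volumes/.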
Step 2: Running a Container with a Volume
Let’s run an nginx container and mount our volume to the directory that serves web content (/usr/share/nginx/html).
We use the -v or --volume flag with the format <volume-name>:<path-in-container>.
docker run -d --name my-nginx -p 8080:80 -v my-app-data:/usr/share/nginx/html nginx
- -d: Runs the container in detached (background) mode.
- --name my-nginx: Names our container.
- -p 8080:80: Maps port 8080 on the host to port 80 in the container.
- -v my-app-data:/usr/share/nginx/html: This is the key part. It tells Docker to mount the my-app-data volume to the /usr/share/nginx/html directory inside the container.
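The same mount can also be written with the more verbose --mount flag, which spells out each part as a key=value pair. The sketch below should start an equivalent container; run_nginx_with_volume is a hypothetical helper name, and it assumes no container called my-nginx is already running.

```shell
# Sketch: the volume mount from the command above, expressed with --mount.
# run_nginx_with_volume is a hypothetical helper name.
# Remove any existing my-nginx container first, or the name will clash.
run_nginx_with_volume() {
  docker run -d --name my-nginx -p 8080:80 \
    --mount type=volume,source=my-app-data,target=/usr/share/nginx/html \
    nginx
}

# Usage (requires a running Docker daemon):
#   run_nginx_with_volume
```

Which form to use is largely a matter of taste: -v is shorter, while --mount names each field explicitly.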
Step 3: Modifying Data and Verifying Persistence
- Write data to the volume from within the container:
docker exec my-nginx bash -c "echo '<h1>Data Persists with Volumes!</h1>' > /usr/share/nginx/html/index.html"
Now, open your browser to http://localhost:8080. You should see your message.
- Prove the data is persistent: Let’s remove the container and create a new one attached to the same volume.
# Stop and remove the container
docker rm -f my-nginx
# Run a new container with the same volume
docker run -d --name new-nginx -p 8080:80 -v my-app-data:/usr/share/nginx/html nginx
Visit http://localhost:8080 again. The message is still there! The data was safely stored in the my-app-data volume on the host, completely unaffected by the container’s removal.
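Because the volume outlives its containers, it stays on the host until you delete it explicitly. A minimal cleanup sketch for this walkthrough might look like the following; cleanup_volume_demo is a hypothetical helper name.

```shell
# Sketch: tear down the volume walkthrough.
# cleanup_volume_demo is a hypothetical helper name.
cleanup_volume_demo() {
  # Stop and remove the container; the volume and its data remain.
  docker rm -f new-nginx
  # Delete the volume itself. This permanently removes the stored data.
  docker volume rm my-app-data
}

# Usage (requires a running Docker daemon):
#   cleanup_volume_demo
```

The order matters: Docker refuses to remove a volume that a container still uses.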
Part 2: Bind Mounts (For Development)
A bind mount is a direct mapping of a file or directory from the host machine into a container. Unlike volumes, you specify the exact path on the host.
Key Characteristics of Bind Mounts:
- Managed by the User: You control the source location on the host filesystem.
- Direct Access: The container gets direct access to the specified file/directory.
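- High Performance: On Linux, the container reads and writes the host files directly, so performance is excellent. On Docker Desktop for macOS and Windows, files are shared with a virtual machine, which can make bind mounts noticeably slower.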
- Ideal for Development: Perfect for mounting your source code into a container for live-reloading as you make changes.
Step 1: Create a Directory and File on Your Host
Create a project directory on your host and an index.html file inside it.
mkdir ~/dev-project
cd ~/dev-project
echo "<h1>Live Code from the Host!</h1>" > index.html
Step 2: Run a Container with a Bind Mount
We use the same -v flag, but this time the format is <absolute-path-on-host>:<path-in-container>.
# Using $(pwd) ensures we provide an absolute path
docker run -d --name dev-nginx -p 8081:80 -v "$(pwd)":/usr/share/nginx/html nginx
Open your browser to http://localhost:8081. You’ll see the message from your local index.html.
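A bind mount can also be declared with --mount by setting type=bind. One practical difference worth knowing: with -v, Docker creates a missing host directory for you, whereas --mount by default reports an error if the source path does not exist. The sketch below uses a hypothetical helper name (run_dev_nginx_mount), container name (dev-nginx-mount) and port (8083) so it does not clash with the container started above.

```shell
# Sketch: a bind mount expressed with --mount instead of -v.
# run_dev_nginx_mount, dev-nginx-mount and port 8083 are hypothetical choices.
run_dev_nginx_mount() {
  src_dir="${1:-$(pwd)}"   # host directory to serve; defaults to the current one

  docker run -d --name dev-nginx-mount -p 8083:80 \
    --mount type=bind,source="$src_dir",target=/usr/share/nginx/html \
    nginx
}

# Usage (requires a running Docker daemon), from inside ~/dev-project:
#   run_dev_nginx_mount
```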
Step 3: Experience Live Reloading
Now, modify the index.html file on your host machine:
echo "<h1>The change is instant!</h1>" > index.html
Refresh your browser. The change appears immediately, without restarting or rebuilding anything. This is the magic of bind mounts for development.
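To confirm what a container actually has mounted, you can ask Docker for the Mounts section of its metadata. This is a small sketch; show_mounts is a hypothetical helper name.

```shell
# Sketch: print the mount configuration of a container as JSON.
# show_mounts is a hypothetical helper name.
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}

# Usage (requires a running Docker daemon):
#   show_mounts dev-nginx
```

Each entry in the output carries a Type field, which lets you tell a bind mount from a volume at a glance.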
Risks and Downsides of Bind Mounts
- Portability Issues: Your application now depends on a specific directory structure on the host, making it less portable.
- Security Risks: A compromised container could modify files on your host filesystem, which is a major security concern.
- Permission Headaches: File permissions (UID/GID) between the host and the container can become misaligned, leading to “Permission Denied” errors, especially on Linux.
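One way to reduce the security risk is to mount the host directory read-only by appending :ro, so the container can read your files but cannot change them. The sketch below assumes the container only needs to serve the files; run_nginx_readonly, dev-nginx-ro and port 8082 are hypothetical names chosen for illustration.

```shell
# Sketch: serve a host directory through a read-only bind mount.
# run_nginx_readonly, dev-nginx-ro and port 8082 are hypothetical choices.
run_nginx_readonly() {
  src_dir="${1:-$(pwd)}"   # host directory to expose; defaults to the current one

  # The trailing :ro makes the mount read-only inside the container.
  docker run -d --name dev-nginx-ro -p 8082:80 \
    -v "$src_dir":/usr/share/nginx/html:ro \
    nginx
}

# Usage (requires a running Docker daemon), from inside ~/dev-project:
#   run_nginx_readonly
```

Edits made on the host still show up in the container; only writes from inside the container are rejected.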
Part 3: tmpfs Mounts (For Non-Persistent, In-Memory Data)
A tmpfs mount is a third option that stores data in the host’s memory, not on the filesystem. When the container stops, the tmpfs mount is removed, and all data within it is gone.
Use Cases:
- Storing temporary, sensitive data that you don’t want written to disk.
- High-speed I/O for temporary files where performance is critical.
docker run -d --name temp-data-container --mount type=tmpfs,destination=/app/cache nginx
Reads and writes under /app/cache inside this container are served from memory, so they are extremely fast, but everything stored there vanishes the moment the container stops.
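Since a tmpfs mount consumes host memory, it is usually sensible to cap its size. The --mount flag accepts tmpfs-size (in bytes) and tmpfs-mode (an octal file mode) for this. The sketch below limits the mount to 64 MiB; the helper name run_with_sized_tmpfs, the container name temp-data-limited, and the chosen size and mode are illustrative assumptions.

```shell
# Sketch: a tmpfs mount with an explicit size limit and file mode.
# run_with_sized_tmpfs and temp-data-limited are hypothetical names.
run_with_sized_tmpfs() {
  # 67108864 bytes = 64 * 1024 * 1024 = 64 MiB
  docker run -d --name temp-data-limited \
    --mount type=tmpfs,destination=/app/cache,tmpfs-size=67108864,tmpfs-mode=1770 \
    nginx
}

# Usage (requires a running Docker daemon on Linux):
#   run_with_sized_tmpfs
```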
Comparison: Volumes vs. Bind Mounts vs. tmpfs
Feature | Volumes | Bind Mounts | tmpfs Mounts |
---|---|---|---|
Host Location | Docker-managed area (/var/lib/docker/volumes) | Any path on the host | Host’s memory |
Persistence | Yes, persists until docker volume rm | Yes, data lives on the host | No, gone when container stops |
Primary Use Case | Production: Databases, logs, user data | Development: Source code, live-reloading | Temporary, non-persistent, sensitive data |
Portability | ✅ High | ❌ Low (tied to host path) | N/A |
Security | ✅ High (managed by Docker) | ⚠️ Medium (container can modify host) | ✅ High (no disk access) |
Management | docker volume CLI commands | Standard filesystem commands | Managed by container lifecycle |
Conclusion: Which One Should You Use?
The choice is simple if you follow these rules of thumb:
- Volumes: The default and best choice for persisting data in production.
- Bind Mounts: The best choice for local development workflows.
- tmpfs Mounts: A special-purpose tool for high-speed, temporary, in-memory data.
By mastering these data persistence techniques, you can build robust, stateful, and production-ready applications with Docker.