Docker is an open platform for developers to build, ship and run distributed applications. Delivering such applications in lightweight containers brings robustness, predictability and repeatability to environments, from the developer’s desktop, to continuous integration, to production.
In the last couple of years, we have seen staggering growth with big companies such as Google, Amazon and Microsoft all announcing platforms and services based on Docker. A swarm of new startups is also offering innovative container solutions.
In this article we will focus on how Docker has helped us support better build and release processes. We will show how we built a continuous deployment platform to allow for faster, more frequent and more reliable deployment of our software projects.
A brief introduction to Docker
The Docker daemon relies on specific Linux features. It can be installed on a Linux host, or on Windows or Mac OS X inside a provided VirtualBox virtual machine (VM), Boot2Docker. Native clients are available for all systems.
Docker leverages a plain text configuration language to describe the process of building an image: what software should be installed, what files should be added, what network ports should be open, what process should be launched to start the container.
Let’s have a look at our first Dockerfile. In this example we will describe building an image for the Nginx web server.
# Pull base image
FROM debian:jessie
# Install Nginx
RUN apt-get update && \
apt-get install -y nginx && \
echo "\ndaemon off;" >> /etc/nginx/nginx.conf
# Expose ports
EXPOSE 80 443
# Define default command
CMD ["nginx"]
Save the contents of this file in an empty directory. From this directory, we can build the erni/nginx image:
docker build -t "erni/nginx" .
And start a container based on this image:
docker run -d -p 80:80 --name "nginx" erni/nginx
What we just did is launch a container based on the “erni/nginx” image in detached mode (-d), bind port 80 of the container to port 80 of the host, and assign the container the name “nginx” so that we can reference it later. Let’s start a couple more containers:
docker run -d -p 81:80 --name "nginx2" erni/nginx
docker run -d -p 82:80 --name "nginx3" erni/nginx
The first thing we notice is that starting a new Nginx container takes only a split second. The image is already built, so starting a new environment is just a matter of launching a single new isolated process. The docker command also lets us list all running containers (docker ps) and administer them from the command line (stop, start, get statistics):
jtremeaux@erni-trje:~/nginx$ docker ps
CONTAINER ID   IMAGE               COMMAND   CREATED          STATUS          PORTS                         NAMES
2972f47f25a6   erni/nginx:latest   nginx     16 seconds ago   Up 15 seconds   443/tcp, 0.0.0.0:82->80/tcp   nginx3
799d3218ba24   erni/nginx:latest   nginx     23 seconds ago   Up 22 seconds   443/tcp, 0.0.0.0:81->80/tcp   nginx2
7fc24d7610de   erni/nginx:latest   nginx     49 seconds ago   Up 48 seconds   443/tcp, 0.0.0.0:80->80/tcp   nginx
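The administration commands mentioned above can be wrapped in a couple of small shell helpers. This is only a sketch: the helper names recycle and discard, and the DOCKER override variable, are inventions of this example and not part of Docker itself.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them. This variable is an assumption of the sketch.
DOCKER="${DOCKER:-docker}"

# Stop a container and start it again under the same name.
# ("docker restart" does the same in a single step.)
recycle() {
    "$DOCKER" stop "$1" && "$DOCKER" start "$1"
}

# Stop a container and delete it, which frees its name and host port.
discard() {
    "$DOCKER" stop "$1" && "$DOCKER" rm "$1"
}
```

For instance, recycle nginx restarts the first container, and discard nginx2 removes the second one. Running docker ps -a afterwards also lists containers that are stopped but not yet removed.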
What about the data?
So far we’ve booted a working web server, but it only serves the default web page. Not very helpful, is it? The images are immutable, and running a new container will start with exactly the same filesystem that has been built into the image.
There are several solutions available for adding content to your web server. The one you’ll use will depend on your specific case.
In our case, we create a new (also immutable) image based on our previous erni/nginx image. With one extra line in the Dockerfile, we add a filesystem layer containing the HTML resources. This is achieved with the ADD <src> <dest> instruction: all files in the src directory (relative to the directory containing the Dockerfile) are copied to the dest directory in the resulting image.
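As an illustration, the following sketch lays out such a build context from the shell. The directory name site, the sample page and the destination path /var/www/html are assumptions made for this example; use whatever document root your nginx.conf declares.

```shell
# Build context: a Dockerfile with an html/ directory next to it.
mkdir -p site/html
echo '<h1>Hello from a container</h1>' > site/html/index.html

# Derived image: the erni/nginx layers plus one layer holding the static files.
# The destination path is an assumption, not taken from the original image.
cat > site/Dockerfile <<'EOF'
FROM erni/nginx
ADD html /var/www/html
EOF
```

The image would then be built with docker build -t "erni/site" site (erni/site being a hypothetical name) and started like the containers above.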
Alternatively, Docker allows us to map a directory from the host onto a directory in the container by adding the switch -v <host_dir>:<container_dir> when running a container. This is useful for exposing specific resources from the host or for debugging, but it should not be used routinely as it breaks the encapsulation of containers.
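A minimal sketch of the host-directory mapping, assuming the erni/nginx image built earlier. The helper name serve_dir, the DOCKER override variable and the container path /var/www/html are assumptions of this example.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command
# instead of running it. This variable is an assumption of the sketch.
DOCKER="${DOCKER:-docker}"

# Serve a host directory through a container.
# $1: absolute path of the host directory, $2: host port to publish.
# The container-side path is an assumption; match it to your nginx.conf.
serve_dir() {
    "$DOCKER" run -d -p "$2:80" -v "$1:/var/www/html" erni/nginx
}
```

Calling serve_dir "$PWD/html" 8080 would publish the local html directory on port 8080. Edits made on the host show up in the container immediately, which is what makes this convenient for debugging.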
The third and last solution is to use the VOLUME instruction inside of the Dockerfile. A volume is a directory that is persistent even after the container is stopped. It means that you can access the data from a new container even after the image is updated. The common use for volumes is to store database contents. Volumes can also be mounted from multiple concurrent containers, e.g. to build a backup server.
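The sketch below declares a volume in a derived image. Since the image at hand is a web server rather than a database, it uses the Nginx log directory as the volume; the directory name logged-nginx is hypothetical, and /var/log/nginx is the log location used by the Debian package.

```shell
# Build context for an image whose log directory lives in a volume.
mkdir -p logged-nginx
cat > logged-nginx/Dockerfile <<'EOF'
FROM erni/nginx
# Keep the log directory outside the image layers
VOLUME /var/log/nginx
EOF
```

After docker build -t "erni/nginx-logs" logged-nginx and docker run -d --name web erni/nginx-logs (both names hypothetical), a second container can mount the same volume while the first one is running, for example docker run --rm --volumes-from web debian:jessie ls /var/log/nginx.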
Why use Docker?
Write once, deploy everywhere
Docker has allowed us to run all kinds of distributed software, from pre-packaged products (databases, CRM, ECM...) to projects that we built for our clients. As long as you can write a series of commands to start your application, it can be run with Docker.
Containers vs VMs
Docker leverages LXC (Linux Containers) and union file systems with copy-on-write (like AuFS) to create a new kind of packaging and process isolation. Containers are run in an isolated environment, but share OS resources.
Docker images are built in layers, so for instance when we created our Nginx image previously, we built one layer for the base image (Debian/Jessie) and another one on top for the Nginx package. Because there is no guest OS and layers can be reused, building an image is very cheap. Layers are typically hundreds of megabytes in size, compared with tens of gigabytes for virtual machine images.
Containers and VMs are not mutually exclusive; Docker is just a regular service that can be run inside a VM.
A virtualized system typically takes minutes to start. Because Docker creates a container for the process instead of booting a full OS, startup takes seconds, and sometimes less than a second. The burden of deployment is reduced and the feedback loop is shortened.
The immutable nature of Docker images gives us the peace of mind that things will work exactly the way they are supposed to work and have been working. If a container is working on the developer’s desktop, we can be confident that it will work in the same way later down the deployment pipeline. If it doesn’t work, it is easy to trace why because we have the explicit instructions to reproduce the environment.
Docker best practices
This is a short list of best practices that we’ve come up with based on our experience with Docker.
Put your Dockerfiles under Version Control
Eventually you’ll have to modify a running container, and your builds should be reproducible and fixable (by someone else). As with code in any programming language, we recommend using a VCS to collaborate on your builds and track changes to them.
Use your own base images
We often have to set up similar environments across our projects: using the same technological stack, setting the default locale and time zone, fixing security issues... The concept of base images is the means by which Docker achieves reuse of configurations. We typically create our own base images for common versions of runtime environments, application containers, etc.
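A base image of this kind might look like the following sketch. The directory name base, the time zone and the locale are example values chosen for illustration, not the settings we actually use.

```shell
# Build context for a shared base image.
mkdir -p base
cat > base/Dockerfile <<'EOF'
FROM debian:jessie
# Shared defaults for time zone and locale (example values)
ENV TZ Europe/Zurich
ENV LANG C.UTF-8
# Pick up security fixes published since the upstream image was built
RUN apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*
EOF
```

Once built with docker build -t "erni/base" base (a hypothetical name), project Dockerfiles can start with FROM erni/base and inherit these settings as shared, cached layers.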
Create the smallest possible containers
The Docker platform offers a natural path towards a microservices architecture, which means designing your application as a set of independently deployable features. Think of containers as small distributed pieces of your system which can be easily linked together. To reduce complexity and build time, your Dockerfiles should avoid installing unnecessary packages. As a rule of thumb, run only one process per container.
Use data volume containers
Building upon the VOLUME instruction, it is easy to segregate application data inside separate containers, and attach them to the application process containers at runtime with the --volumes-from=data_container switch. While this represents a shift in mindset from storing the data on the host, such data-only containers are a best practice because they keep data volumes visible (where is my data? how do I back it up?). That is why we use them systematically for persistent data.
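The data-only container pattern can be sketched with three helpers. The function names, the /data path, the debian:jessie image used for the data container and the DOCKER override variable are all assumptions of this example.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them. This variable is an assumption of the sketch.
DOCKER="${DOCKER:-docker}"

# Create (without starting) a data-only container that owns the volume /data.
# $1: name of the data container.
create_data_container() {
    "$DOCKER" create -v /data --name "$1" debian:jessie /bin/true
}

# Run an application container with the volumes of a data container attached.
# $1: name of the data container, $2: application image.
run_with_data() {
    "$DOCKER" run -d --volumes-from "$1" "$2"
}

# Archive the volume into the current host directory from a throwaway container.
# $1: name of the data container.
backup_data() {
    "$DOCKER" run --rm --volumes-from "$1" -v "$PWD:/backup" debian:jessie \
        tar czf "/backup/$1.tar.gz" /data
}
```

With these, create_data_container app_data followed by run_with_data app_data erni/app (a hypothetical image) keeps the data in a named, visible place, and backup_data app_data answers the "how do I back it up?" question with a single command.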
Continuous delivery with Git + Jenkins + Docker
Our goal while building a continuous delivery platform was to create a generic solution that we can easily set up throughout our projects. We tried to achieve something similar to cloud platforms (PaaS) such as Heroku, while keeping total control of the resources, build and deployment processes. From the developer’s perspective, deploying an application means merging and pushing a branch from the IDE.
The following chart depicts the architecture of our continuous delivery system:
Step 1. The developer merges changes into a branch with a specific name (e.g. “integration”) and pushes them to the remote. Depending on the project, we have branches for various stages like integration, validation and production.
Step 2. A post-receive hook on the Git remote triggers a build on Jenkins CI.
Step 3 (optional). If the build is successful and passes the tests, the artifacts (.jar, ...) are published on the Nexus Repository Manager.
Step 4. Jenkins builds an updated Docker image from the Dockerfile + artifacts (docker build…)
Step 5. Jenkins removes the previous container and starts a new one with this image (docker rm… and docker run…)
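Steps 4 and 5 could be scripted in a Jenkins shell build step along the following lines. This is a sketch under stated assumptions, not our actual job configuration: the deploy function, its parameters, the DOCKER override variable and the fixed container port 80 are all choices made for this example.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them. This variable is an assumption of the sketch.
DOCKER="${DOCKER:-docker}"

# Rebuild the image, then replace the running container.
# $1: image tag, $2: container name, $3: host port, $4: build context directory.
deploy() {
    # Step 4: build the updated image; stop here if the build fails,
    # so that the previous container keeps running.
    "$DOCKER" build -t "$1" "$4" || return 1
    # Step 5: remove the previous container if there is one
    # (errors such as "no such container" are ignored)...
    "$DOCKER" rm -f "$2" >/dev/null 2>&1 || true
    # ...and start a new one from the fresh image.
    "$DOCKER" run -d -p "$3:80" --name "$2" "$1"
}
```

A job would call, for example, deploy erni/app app 8080 . from the workspace containing the Dockerfile and the artifacts. Building before removing means a broken build never takes the running version down.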
Automated reverse proxy with Haproxy + Dockergen
When running web services within Docker containers, it is often useful to run a reverse proxy in front of them to make them accessible. Configuring such a proxy requires specific knowledge of the network configuration and proxy software, which is not always accessible to the developers. A dockerized reverse proxy can help with these issues as well as improve availability by facilitating zero-downtime, zero-configuration deployments.
To achieve this, we use an additional program called Dockergen. By adding a switch to the proxy container (-v /var/run/docker.sock:/tmp/docker.sock), Dockergen is able to pick up events from the Docker daemon (a container starting or stopping), read the containers' environment variables, and generate a new configuration for Haproxy from a template file, haproxy.cfg.tmpl. What is noteworthy here is that while the reverse proxy sits quietly waiting for Docker events, the real configuration is done from the proxied applications by adding a single environment variable (-e VIRTUAL_HOST=www.my_app.ch, or -e SECURE_VIRTUAL_HOST=www.my_secure_app.ch) to their runtime parameters. At no time do the developers need to worry about the intricacies of modifying reverse proxy configurations.
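From the command line, the setup described above might look like this sketch. The image name erni/haproxy-dockergen is hypothetical (it stands for an image bundling Haproxy and Dockergen), as are the helper names and the DOCKER override variable.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them. This variable is an assumption of the sketch.
DOCKER="${DOCKER:-docker}"

# Start the proxy once. It follows container events through the mounted socket.
# erni/haproxy-dockergen is a hypothetical image name.
start_proxy() {
    "$DOCKER" run -d -p 80:80 --name proxy \
        -v /var/run/docker.sock:/tmp/docker.sock \
        erni/haproxy-dockergen
}

# Start an application; its host name is the only proxy-related setting.
# $1: container name, $2: virtual host, $3: application image.
start_app() {
    "$DOCKER" run -d --name "$1" -e "VIRTUAL_HOST=$2" "$3"
}
```

After start_proxy, a call such as start_app my_app www.my_app.ch erni/my_app would be enough for the application to become reachable under that host name. Note that the application container publishes no host port here; the sketch assumes the proxy reaches it over Docker's internal network.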
In this article, we’ve barely scratched the surface of what can be achieved with Docker. While we’ve already built a sophisticated solution for continuous delivery in the integration and validation stages, containerized solutions can also be pushed further down the release pipeline to help with issues such as availability and scalability.