Docker for Development
Below we lay out recommendations for using Docker to wrap development dependencies. Unfortunately, this is a relatively new space and we don't have a consistent, end-to-end setup at the moment. Instead, we'll provide pointers, partial solutions, and examples. We intend to flesh out complete solutions, along with a more comprehensive introduction, as they become apparent.
Why?
Consider your app's production environment. You're likely running on cloud.gov (which runs Ubuntu Linux) with specific versions of databases, app language (e.g. Go/Node/Python/Ruby), and required libraries (you are pinning your dependencies, right?). Now consider your local environment: if you're a TTS engineer, you likely have OS X, language version management tools, and an assortment of strategies for running databases, worker queues, etc. If you're a designer, open source contributor, or working at a partner agency, we should assume even less. Not only do we have significant mismatches between these environments, we also require significant setup for new and infrequent contributors.
Docker promises a partial solution to this problem by wrapping dependencies into a consistent, reproducible environment. While we don't generally support Docker in production, we can create a setup that matches cloud.gov relatively closely and which makes running our app painless.
Within GSA, Docker Desktop can be installed through Self Service without admin rights. This allows people doing "light" development (like editing content) to run the site locally. Docker also hides the complexity of setting up a development environment from them.
Recommendations
We hope to have an end-to-end recommendation in the future, but for now we provide these scoped recommendations.
Use Docker-Compose
Docker introductions tend to highlight the Dockerfile, a procedural description of how to build a Docker image, but that's generally not encompassing enough for development needs. We will likely want to run a database (or other service) separate from our application, and provide different environments (e.g. one for building, one for development, one for production-like); while it's possible to include all of that in a single image, it's generally better to use separate, focused images. Moreover, while Dockerfiles provide hints around which ports should be exposed, which files should be shared with the host, etc., the docker-compose file instantiates specific values. This is needed when we run the app locally.
Docker-compose files solve both of these problems by allowing us to specify different services, dependencies between those services, and specific values for exposed ports, mount points, etc. Though it feels more complex at first, we recommend using docker-compose even when a single Dockerfile would suffice so that we have a consistent tool across TTS engineering. One of the bigger "aha" moments you will encounter is transitioning from thinking of Docker as a VM build script (a la vagrant) to thinking of it as a way to configure ("orchestrate") several single-purpose containers (i.e. microservices).
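For a sense of what this buys us, here is a minimal docker-compose manifest that runs an application container alongside a database container and declares the dependency between them. The service names, images, ports, and startup command below are illustrative, not a TTS standard.
# docker-compose.yml -- a minimal sketch; names and values are illustrative
version: "2"
services:
  my-database-service:
    image: postgres:9.6
  my-service:
    image: python:3.5
    depends_on:
      - my-database-service
    ports:
      - "8000:8000"
    volumes:
      - $PWD:/usr/src/app
    working_dir: /usr/src/app
    command: python -m http.server 8000  # placeholder startup command
Running docker-compose up starts both containers in the right order, and the same file documents the ports and mount points everyone on the team uses.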
Base images
There's a balance to be struck between mirroring cloud.gov (e.g. through the cflinuxfs3 base image based on Ubuntu 18.04 LTS, aka Bionic Beaver) and using official Docker images for Node, Postgres, Nginx, etc. We're waiting on Docker-based buildpacks to stabilize, but in the meantime we recommend using the Ubuntu build (where available) of the official images, such as python:3.5, ruby:2.4, etc. Avoid the slim and alpine images, as they aren't likely to closely match the production environment.
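For example, when declaring images in docker-compose, prefer the default Debian/Ubuntu-based tags over their slim or alpine variants. The tags below only illustrate the naming pattern.
# docker-compose.yml -- sketch; pick the fuller base images
my-service:
  image: python:3.5            # Debian-based; closer to cloud.gov's Ubuntu stack
  # image: python:3.5-slim     # avoid: strips packages you may rely on in production
  # image: python:3.5-alpine   # avoid: Alpine's musl libc diverges further from production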
Mounting the working directory
Demanding that developers rebuild their Docker images after every modification won't fly, and it isn't necessary. Instead, we'll want to mount the source code into a shared volume, meaning edits on the host file system are visible within the container and vice versa. This will likely look something like:
# docker-compose.yml
my-service:
  volumes:
    - $PWD:/usr/src/app
  working_dir: /usr/src/app
  ...
where /usr/src/app is the mount point within the container (different Docker images will assume the project's code is at different spots within the file system; be sure to verify).
By doing this, you guarantee that the container sees your changes without needing to rebuild the Docker image (which can take a considerable amount of time).
Does this mean that your Dockerfile shouldn't include the application's source code? It certainly means that whatever the container had stored at the mount point is ignored during local development. That doesn't rule out all of the benefits of including a checkpoint of the application source within the Docker image, however. See Are Dockerfiles needed? for further discussion.
Matching cloud.gov
We'd like our local environment to match production, so let's mirror our cloud.gov manifest files in docker-compose.yml. This way, we'll be accessing our configuration variables the same way (e.g. via cf-env) in our two environments.
Adding fixed environment variables (e.g. DJANGO_SETTINGS_MODULE) is straightforward: just add an environment key to your service definition. We'll also want to include settings provided within cloud.gov like PORT, DATABASE_URL, and (potentially) TMPDIR. See the full list from Cloud Foundry. We'll also want to duplicate the serialized JSON of the VCAP_APPLICATION and VCAP_SERVICES variables, which is pretty straightforward using a "folded" string literal (>):
my-service:
  environment:
    DATABASE_URL: postgres://postgres@my-database-service/postgres
    PORT: 8000
    SOMETHING: Else
    TMPDIR: /tmp
    VCAP_APPLICATION: >
      {"uris": ["localhost", "0.0.0.0", "127.0.0.1"]}
    VCAP_SERVICES: >
      {"my-user-provided-service": [{"credentials": {"SOME_KEY": "VALUE"}}],
      "my-elastic-service": [{"more": "settings"}]}
Jekyll
The Docker configuration from the Handbook is very copy-able for other Jekyll sites; specifically, see the Docker-related files in that repository.
For further debate
As with any new tool, we need time to derive best practices. Below we catalog a few of the more important debates and solutions we're actively using. We'll see which of the competing strategies resonates the most at a future date.
Library storage
One of our primary needs from Docker is to wrap all of our application's runtime (and development) libraries -- how do we do that?
One strategy would be to create a Dockerfile which instructs Docker to include a description of these libraries (requirements.txt, package.json, Gemfile.lock, etc.) and to download those requirements into the constructed Docker image (e.g. via pip install, gem install, npm install -g). Any commands which use this constructed image would have the libraries already at their disposal. The benefit to this approach is that we could ship this image around and all users would have access to the full environment (this is the promise of Docker in production). The cost is that every change to our requirements forces us to rebuild our images.
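A sketch of this first strategy, assuming a Python app whose dependencies are listed in requirements.txt (the paths and versions here are illustrative):
# Dockerfile -- sketch of baking libraries into the image
FROM python:3.5
WORKDIR /usr/src/app
# Copy only the dependency description first so this layer is cached
# until requirements.txt changes.
COPY requirements.txt ./
RUN pip install -r requirements.txt
# Any edit to requirements.txt invalidates the cache from the COPY onward,
# forcing a rebuild of the layers below.
COPY . .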
A different strategy would store the libraries somewhere in the current directory, as we've come to expect from npm install and node_modules. This can be recreated in other languages through careful configuration of virtualenvs, bundler, etc. In this scenario, we don't need to worry about rebuilding images based on requirements (which are effectively dynamically included). Further, we can destroy our Docker containers and images with little fear, as all the libraries are actually stored in the host's file system. This approach requires a bit more planning and configuration and diverges from the majority of Docker's use cases (meaning documentation is harder to find). It also has confusion points around Linux binaries being stored on OS X/Windows/etc.
As a third strategy, we can try storing these libraries within a Docker volume, meaning they stay with the project between restarts but are not (generally) accessible from the host. We'd more commonly use volumes like this for database storage, but there's nothing to stop us from storing our runtime libraries the same way. This approach has the benefit of being very flexible -- our docker-compose configurations aren't closely tied to our specific projects -- but it requires significant planning and configuration.
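A sketch of this third strategy, assuming a Node app whose libraries land in node_modules (the service name and paths are illustrative):
# docker-compose.yml -- sketch of storing libraries in a named Docker volume
services:
  my-service:
    image: node:8
    working_dir: /usr/src/app
    volumes:
      - $PWD:/usr/src/app                    # source code is shared with the host
      - libraries:/usr/src/app/node_modules  # libraries persist in a volume the host never sees
volumes:
  libraries:
The libraries volume survives container rebuilds and docker-compose down, but removing it (e.g. with docker-compose down -v) means reinstalling the libraries.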
One-off commands
Apps generally require more than a single "run-everything" command -- we'll want to get to the database console, run tests, migrate data, etc. etc. We have several strategies at our disposal.
Before we dive too deep, it's important to first discuss docker-compose run. Consider
docker-compose run --rm my-service py.test --pdb
This starts your my-service (as defined in your docker-compose manifest), including any necessary dependencies, such as databases. It doesn't execute my-service's startup command, however; instead it runs py.test within the my-service container. Once that command finishes, Docker stops and deletes (--rm) the running container (any services it depended on will continue to run).
As a first approach, we can run these one-off commands directly as described above. This works well, but it requires engineers to keep track of the relevant commands and which image each should be run within. We can soften the burden by writing wrapper shell scripts or command aliases.
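For example, a small wrapper script (the file name and service name here are hypothetical) saves everyone from memorizing the incantation:
#!/bin/sh
# test.sh -- hypothetical wrapper around the one-off test command,
# so nobody has to remember which service the tests run in.
exec docker-compose run --rm my-service py.test "$@"
Engineers then run ./test.sh --pdb and any extra arguments pass straight through to py.test.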
A second strategy places those commands within the docker-compose manifest as pseudo-services, e.g.
services:
  py.test:
    image: thing-that-contains-pytest
    volumes:
      - $PWD:/apps-dir
    entrypoint: py.test
These would be executed via
docker-compose run --rm py.test --pdb
This approach defines a concise list of the entry points to your software suite, but may require additional image rebuilding and can be confusing when combined with docker-compose up. If taking this approach, be sure to use YAML anchors to reduce duplication (search within this example for more details).
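A sketch of that anchor technique (service names and the image are illustrative); the shared settings are declared once and merged into each pseudo-service:
# docker-compose.yml -- sketch using a YAML anchor to avoid repeating shared settings
services:
  web: &app-defaults
    image: thing-that-contains-pytest
    volumes:
      - $PWD:/apps-dir
    working_dir: /apps-dir
  py.test:
    <<: *app-defaults
    entrypoint: py.test
  manage.py:
    <<: *app-defaults
    entrypoint: python manage.py
Note that a bare docker-compose up will try to start the pseudo-services too (the confusion mentioned above); naming the service, e.g. docker-compose up web, sidesteps that.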
If using Django, Atul has written a manage.py library which intelligently selects between docker-compose and the local environment. If we can describe all of our one-off commands via Django, this tactic may get you going faster.
Are Dockerfiles needed?
Describing our application setup in a Dockerfile is a great way to create a standard platform for the team. We aren't actually shipping this image around, however, and, in many cases, we're overriding much of the work it performed (e.g. by replacing the working directory with the local version of the files). Can we get by without an application image, then?
For example, consider a docker-compose manifest that referred only to official images but shared a Docker volume:
services:
  my-app:
    image: python:3.5
    volumes:
      - dependencies:/path/to/dependency/storage
volumes:
  dependencies:
Then we could execute all of our application setup without an application image:
docker-compose run --rm my-app pip install -r requirements.txt
docker-compose run --rm my-app gunicorn # start app
This approach isn't described often (it seems the community is still settling), but it's worth considering.
Examples
Docker as primary dev env
Docker as alternative dev env
Docker Hub
TTS has a couple of organizations in Docker Hub:
Additional reading
- Atul's Reflections on Docker-based development
- OMB eRegs' Resolving common container issues