Wednesday, November 16, 2022
Dev Container Setup for Python
I’ve written a lot recently about Swift, but Python remains the programming language that I use the most. And given that Visual Studio Code’s Dev Containers feature has become my favorite way to manage my development environments, it occurred to me that I should detail my Dev Container setup for Python. As a reminder, Dev Containers allow you to use one or more Docker containers as a development environment, enabling you to have a fully isolated, similar-to-production, Linux-based environment in which to code. And all of the configuration files can be committed to your repository, allowing all team members to consistently use the same environment.
First, The Inevitable Aside on Python Packaging
One aspect of Python that many developers find extremely compelling is the vibrant and active community of third-party, open source packages. If you ever find yourself running into a problem and thinking, “Surely someone out there has figured this out before,” there’s a very good chance that an installable package is available that solves it for you. Indeed, at the time this post was written, there were 415,359 packages available in Python’s canonical package repository, the Python Package Index (PyPI).[1]
Unfortunately, while Python has long had this cornucopia of installable packages, the details of installing those packages have been... not great. By default, packages are installed in a directory that is global to Python; any Python project you use or work on accesses the same global set of packages. That’s fine until you happen to work on multiple projects with conflicting dependencies. So then came virtual environments, which allowed you to have separate pockets of Python on your system, each with its own set of packages. But virtual environments didn’t help you keep track of the packages you needed to install, so along came pip-tools, which would write out your dependencies to a text file. But pip-tools didn’t help you manage your virtual environments or help you manage conflicts among your dependencies’ dependencies, so along came pipenv and poetry to help you with all of that.
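To make the "separate pockets of Python" idea concrete, here is a minimal sketch of how a script can tell whether it is running in the global environment or inside a virtual environment. The function names are my own, chosen for illustration; the only real API used is the standard `sys` module.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment itself, while
    # sys.base_prefix still points at the Python installation it was made from.
    return sys.prefix != sys.base_prefix


def describe_environment() -> str:
    """Return a one-line, human-readable summary of the active environment."""
    kind = "virtual environment" if in_virtual_environment() else "global environment"
    return f"{kind}: {sys.prefix}"


if __name__ == "__main__":
    print(describe_environment())
```

Running this with two different interpreters on the same machine should show two different prefixes, each with its own set of installed packages.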
You might be forgiven for thinking that none of this packaging messiness applies to Dev Containers, since they give each project a fully isolated virtual machine. And indeed, it is possible to simply use the global Python environment inside each project’s Dev Container. I have found this solution to be lacking, however, for the following reasons:
- I invariably run into permissions problems trying to install in the Dev Container’s global Python environment, as it is owned by root while development usually happens under the vscode user. This can be worked around, but it tends to leave things in a quirky state.
- It is still good practice to track your project’s dependencies and to pin known compatible versions of them in a way that lends itself to reliable replication. And it is still good practice to update your dependencies and your dependencies’ dependencies on a regular basis. The pipenv and poetry tools were designed for just this sort of thing and they excel at it.
- Other developers on your team may not be using Dev Containers, and your production environment certainly won’t, so you still need good dependency hygiene even outside of a Dev Container context.
All of this is to say that I still recommend using virtual environments and dependency managers inside of Dev Containers. More specifically, I strongly recommend using either pipenv or poetry to manage your dependency tree. They are both excellent, so it is difficult to universally recommend one over the other. That said, poetry has adopted the pyproject.toml file as the way to record top-level dependencies. pyproject.toml has become the recommended way to record a project’s metadata, so these days I tend to use poetry.
Now, the Actual Dev Container Setup
First, the docker-compose.yml file creates our overall container environment:
version: '3.8'

services:
  app:
    build:
      context: ..
      dockerfile: .devcontainer/Dockerfile
      args:
        VARIANT: "3.10"
        NODE_VERSION: "none"
    volumes:
      - ..:/workspace:cached
    command: sleep infinity
    network_mode: service:db
    user: vscode
    env_file: .env

  db:
    image: postgres:14.5
    restart: unless-stopped
    env_file: postgres.env

volumes:
  postgres-data:
Based mostly on Microsoft’s default, this compose file specifies 3.10 as the target version of Python[2] and avoids installing Node. Some key differences from the default version of the file:
- I specify an environment file from which environment variables will be set when the container starts. This file, .devcontainer/.env, must be present in the .devcontainer folder and valid or else the container will fail to start.
- I prefer to pin the version of PostgreSQL I use rather than just using latest, as I want to use the same version that is in my production environment.
- I also use an environment file, .devcontainer/postgres.env, to specify the database username and password. Be sure to add *.env to your .gitignore file so you don’t accidentally include the environment files in your repository.
The contents of your .env file will depend on the kind of project you are working on; you might not even need it at all. The postgres.env file is rather simple:
POSTGRES_DB={{ preferred DB name }}
POSTGRES_USER={{ preferred DB username }}
POSTGRES_PASSWORD={{ preferred DB password }}
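Because a missing or incomplete environment file stops the containers from coming up, it can be handy to sanity-check postgres.env before rebuilding. What follows is a rough sketch rather than part of my actual setup: the helper names are hypothetical, and the parser is deliberately simpler than the one Docker Compose uses (no quoting or variable interpolation). The connection URL assumes the database is reachable on localhost, which follows from the `network_mode: service:db` line in the compose file, and on PostgreSQL's default port of 5432.

```python
from urllib.parse import quote

# The three variables the postgres.env file above is expected to define.
REQUIRED_KEYS = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def database_url(values: dict[str, str], host: str = "localhost", port: int = 5432) -> str:
    """Build a PostgreSQL connection URL, percent-encoding the credentials."""
    user = quote(values["POSTGRES_USER"], safe="")
    password = quote(values["POSTGRES_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{values['POSTGRES_DB']}"
```

You could read the file with `pathlib.Path(".devcontainer/postgres.env").read_text()`, pass the result to `parse_env_file`, and refuse to continue if `missing_keys` returns anything.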
Next, the Dockerfile for the main development container:
# Python version tag for the base image; overridden by docker-compose.yml
ARG VARIANT=3
FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT}

ENV PYTHONUNBUFFERED=1

# Optionally install Node.js; skipped when NODE_VERSION is "none"
ARG NODE_VERSION="none"
RUN if [ "${NODE_VERSION}" != "none" ]; then su vscode -c "umask 0002 && . /usr/local/share/nvm/nvm.sh && nvm install ${NODE_VERSION} 2>&1"; fi

# PostgreSQL client tools (psql)
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
    && apt-get -y install --no-install-recommends \
    postgresql-client

# Install isort and poetry globally with pipx, then upgrade pip
RUN /usr/local/py-utils/bin/pipx install --system-site-packages --pip-args '--no-cache-dir --force-reinstall' isort && \
    /usr/local/py-utils/bin/pipx install --system-site-packages --pip-args '--no-cache-dir --force-reinstall' poetry && \
    pip install --upgrade pip

# Poetry configuration, shell completions, and the project virtual environment
COPY .devcontainer/config/pypoetry_config.toml /home/vscode/.config/pypoetry/config.toml
RUN chown -R vscode:vscode /home/vscode/.config && \
    /usr/local/py-utils/bin/poetry completions bash > /etc/bash_completion.d/poetry.bash-completion && \
    python -m venv /workspace/.venv --prompt feria && \
    chown -R vscode:vscode /workspace/.venv
Again, this is based mostly on Microsoft’s default but with a number of noteworthy changes:
- The postgresql-client OS package is installed in order to make the psql command available in the development environment.
- pipx, which comes pre-installed in the Python image, is used to globally install isort and poetry.
- A configuration file for poetry is copied into the container (more on that in a bit).
- Shell completion hints for poetry in the bash shell are installed.
- The development virtual environment is created. As indicated above, this can be skipped, but I find it easier to work with a virtual environment rather than the root-owned global Python environment in the container. By placing the virtual environment inside the /workspace folder, it will survive rebuilds of the container and some utilities in VS Code will detect it automatically.
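For the curious, the virtual environment step in the Dockerfile has a rough equivalent in Python's standard `venv` module. This is only an illustrative sketch: `create_project_venv` is a name I made up, and unlike the `python -m venv` command line it skips installing pip (`with_pip=False`) so that it runs quickly. The example writes to a temporary directory rather than /workspace.

```python
import tempfile
import venv
from pathlib import Path


def create_project_venv(env_dir: Path, prompt: str) -> Path:
    """Create a virtual environment at env_dir and return the path to its pyvenv.cfg."""
    # prompt sets the name shown in the shell when the environment is activated,
    # like the --prompt option used in the Dockerfile.
    venv.create(env_dir, with_pip=False, prompt=prompt)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_project_venv(Path(tmp) / ".venv", prompt="feria")
        # pyvenv.cfg records which base interpreter the environment was built from.
        print(cfg.read_text())
```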
The configuration file for poetry simply tells poetry not to create new virtual environments. This is not absolutely necessary. If you prefer, you can omit both this file and the virtual environment creation step in the Dockerfile; poetry will then create a virtual environment for you when invoked. I just prefer to have control over where the virtual environment is created for the reasons mentioned above.
[virtualenvs]
create = false
Finally, the devcontainer.json file, based on Microsoft’s default:
{
    "name": "{{ project name }}",
    "dockerComposeFile": "docker-compose.yml",
    "service": "app",
    "workspaceFolder": "/workspace",
    "settings": {
        "editor.formatOnSave": true,
        "python.analysis.extraPaths": [
            "/workspace/source"
        ],
        "python.defaultInterpreterPath": "/workspace/.venv/bin/python",
        "python.formatting.blackPath": "/usr/local/py-utils/bin/black",
        "python.formatting.provider": "black",
        "python.languageServer": "Pylance",
        "python.linting.enabled": true,
        "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy",
        "python.linting.pylintEnabled": true,
        "python.linting.pylintPath": "/workspace/.venv/bin/pylint",
        "python.testing.pytestPath": "/usr/local/py-utils/bin/pytest",
        "isort.path": [
            "/usr/local/py-utils/bin/isort"
        ],
        "rewrap.wrappingColumn": 88,
        "[python]": {
            "editor.codeActionsOnSave": {
                "source.organizeImports": true
            }
        }
    },
    "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "bungcip.better-toml"
    ],
    "forwardPorts": [
        {{ your project’s preferred port, default 8000 for Django }}
    ],
    "postCreateCommand": "VIRTUAL_ENV=\"/workspace/.venv\" PATH=\"$VIRTUAL_ENV/bin:$PATH\" poetry install --no-interaction --no-ansi --with dev",
    "remoteUser": "vscode"
}
Including some VS Code settings here can help your team establish consistent practices around code formatting and linting. Just as with the customizations above, it is not necessary but I find it to be good practice.
- A number of widely-used Python development utilities, like Black and pytest, come pre-installed in the Python dev container image; their paths are referenced in the devcontainer.json file. I have activated the ones I prefer to use.
- Although Pylint also comes pre-installed, I find it works better when installed inside the virtual environment; that is the version that is referenced. We installed isort earlier, so that is also referenced.
- I install a TOML extension for highlighting the pyproject.toml file; be sure to include any other extensions you like to use.
- Finally, the postCreateCommand automatically uses poetry to install any dependencies, including dev dependencies.
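Since several of these settings point at tool executables by absolute path, a mistyped or not-yet-installed tool can fail quietly. Here is a small, hypothetical helper (not something VS Code provides) that takes the settings as a dictionary and reports which configured tool paths do not exist. It assumes that every setting whose name ends in "path" holds either one path or a list of paths, which is true of the settings shown above but is not a general rule.

```python
import os
from typing import Callable


def missing_tool_paths(
    settings: dict, exists: Callable[[str], bool] = os.path.exists
) -> dict[str, list[str]]:
    """Map each path-like setting to the configured paths that do not exist."""
    missing = {}
    for key, value in settings.items():
        # Matches names such as "python.linting.pylintPath" and "isort.path".
        if not key.lower().endswith("path"):
            continue
        paths = value if isinstance(value, list) else [value]
        absent = [p for p in paths if isinstance(p, str) and not exists(p)]
        if absent:
            missing[key] = absent
    return missing
```

Run inside the container, I would expect this to flag the Pylint path until the project's dev dependencies, Pylint included, have been installed into the virtual environment.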
[1] Née Cheeseshop.
[2] Note that the quotation marks surrounding the Python version number are necessary; otherwise, 3.10 will be interpreted as a floating point number and simplified to 3.1! Hopefully no one is still developing with Python 3.1 at this point. If you are using an Apple Silicon-based Mac, however, you should append -bullseye to the VARIANT value and then you needn’t use the quote marks as that will be interpreted as a string. Other versions of Python, such as 3.11, also don’t need the quote marks as they are not susceptible to the trailing zero simplification.
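The trailing-zero problem is easy to reproduce in plain Python. The sketch below only mimics what happens when an unquoted value is read as a floating point number and then turned back into text; it is not a YAML parser, and the function name is hypothetical.

```python
def as_unquoted_scalar(version: str) -> str:
    """Mimic reading an unquoted version as a float and writing it back out."""
    try:
        # "3.10" becomes the float 3.1, so the trailing zero is lost.
        return str(float(version))
    except ValueError:
        # Values such as "3.10-bullseye" are not numbers and stay strings.
        return version


if __name__ == "__main__":
    for candidate in ("3.10", "3.11", "3.10-bullseye"):
        print(candidate, "->", as_unquoted_scalar(candidate))
```

Only the first of the three candidates changes, which matches the cases described in the note above.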