Wednesday, November 16, 2022

Dev Container Setup for Python

I’ve written a lot recently about Swift, but Python remains the programming language that I use the most. And given that Visual Studio Code’s Dev Containers feature has become my favorite way to manage my development environments, it occurred to me that I should detail my Dev Container setup for Python. As a reminder, Dev Containers allow you to use one or more Docker containers as a development environment, enabling you to have a fully isolated, similar-to-production, Linux-based environment in which to code. And all of the configuration files can be committed to your repository, allowing all team members to consistently use the same environment.

First, The Inevitable Aside on Python Packaging

One aspect of Python that many developers find extremely compelling is the vibrant and active community of third-party, open source packages. If you ever find yourself running into a problem and thinking, “Surely someone out there has figured this out before,” there’s a very good chance that an installable package is available that solves it for you. Indeed, at the time this post was written, there were 415,359 packages available in Python’s canonical package repository, the Python Package Index (PyPI).1

Unfortunately, while Python has long had this cornucopia of installable packages, the details of installing those packages have been... not great. By default, packages are installed in a directory that is global to Python; any Python project you use or work on accesses the same global set of packages. That’s fine until you happen to work on multiple projects with conflicting dependencies. So then came virtual environments, which allowed you to have separate pockets of Python on your system, each with its own set of packages. But virtual environments didn’t help you keep track of the packages you needed to install, so along came pip-tools, which would write out your dependencies to a text file. But pip-tools didn’t help you manage your virtual environments or the conflicts among your dependencies’ dependencies, so along came pipenv and poetry to help you with all of that.
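The baseline workflow described above can be sketched with nothing more than the standard library’s venv module and pip; the directory and file names here are conventional examples, not requirements:

```shell
# Create an isolated environment in the .venv directory
python3 -m venv .venv

# Packages now install into .venv instead of the global site-packages,
# e.g.: .venv/bin/pip install <some-package>
# (or activate the environment first with: . .venv/bin/activate)

# Snapshot the exact installed versions; the environment can then be
# recreated elsewhere with: .venv/bin/pip install -r requirements.txt
.venv/bin/pip freeze > requirements.txt
```

This works, but as noted, keeping requirements.txt current and resolving transitive conflicts is entirely on you, which is the gap pipenv and poetry fill.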

You might be forgiven for thinking that none of this packaging messiness applies to Dev Containers, since they give each project a fully isolated container. And indeed, it is possible to simply use the global Python environment inside each project’s Dev Container. I have found this solution to be lacking, however, for the following reasons:

  1. I invariably run into permissions problems trying to install in the Dev Container’s global Python environment, as it is owned by root while development usually happens under the vscode user. This can be worked around, but it tends to leave things in a quirky state.

  2. It is still good practice to track your project’s dependencies and to pin known compatible versions of them in a way that lends itself to reliable replication. And it is still good practice to update your dependencies and your dependencies’ dependencies on a regular basis. The pipenv and poetry tools were designed for just this sort of thing, and they excel at it.

  3. Other developers on your team may not be using Dev Containers, and your production environment certainly won’t, so you still need good dependency hygiene even outside of a Dev Container context.

All of this is to say that I still recommend using virtual environments and dependency managers inside of Dev Containers. More specifically, I strongly recommend using either pipenv or poetry to manage your dependency tree. They are both excellent, so it is difficult to universally recommend one over the other. That said, poetry has adopted the pyproject.toml file as the way to record top-level dependencies. pyproject.toml has become the recommended way to record a project’s metadata, so these days I tend to use poetry.
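For reference, here is roughly what a minimal pyproject.toml managed by poetry looks like; the project metadata, package names, and version constraints below are placeholders, not recommendations:

```toml
[tool.poetry]
name = "my-project"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"

# Development-only dependencies live in a separate group
[tool.poetry.group.dev.dependencies]
pytest = "^7.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```

poetry reads and updates this file for you; you rarely need to edit it by hand beyond the initial metadata.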

Now, the Actual Dev Container Setup

First, the docker-compose.yml file creates our overall container environment:

.devcontainer/docker-compose.yml

version: '3.8'
services:
  app:
    build:
      context: ..
      dockerfile: .devcontainer/Dockerfile
      args:
        VARIANT: "3.10"
        NODE_VERSION: "none"
    volumes:
      - ..:/workspace:cached
    command: sleep infinity
    network_mode: service:db
    user: vscode
    env_file: .env
  db:
    image: postgres:14.5
    restart: unless-stopped
    env_file: postgres.env
volumes:
  postgres-data:

Based mostly on Microsoft’s default, this compose file specifies 3.10 as the target version of Python2 and avoids installing Node. Some key differences from the default version of the file:

  • I specify an environment file from which environment variables will be set when the container builds. This file — .devcontainer/.env — must be present in the .devcontainer folder and valid or else the container will fail to build.
  • I prefer to pin the version of PostgreSQL I use rather than just latest, as I want to use the same version that is in my production environment.
  • I also use an environment file — .devcontainer/postgres.env — to specify the database username and password. Be sure to add *.env to your .gitignore file so you don’t accidentally include the environment files in your repository.

The contents of your .env file will depend on the kind of project you are working on; you might not even need it at all. The postgres.env file is rather simple:

.devcontainer/postgres.env

POSTGRES_DB={{ preferred DB name }}
POSTGRES_USER={{ preferred DB username }}
POSTGRES_PASSWORD={{ preferred DB password }}

Next, the Dockerfile for the main development container:

.devcontainer/Dockerfile

ARG VARIANT=3
FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT}
ENV PYTHONUNBUFFERED 1
ARG NODE_VERSION="none"
RUN if [ "${NODE_VERSION}" != "none" ]; then su vscode -c "umask 0002 && . /usr/local/share/nvm/nvm.sh && nvm install ${NODE_VERSION} 2>&1"; fi
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
    && apt-get -y install --no-install-recommends \
    postgresql-client
RUN /usr/local/py-utils/bin/pipx install --system-site-packages --pip-args '--no-cache-dir --force-reinstall' isort && \
    /usr/local/py-utils/bin/pipx install --system-site-packages --pip-args '--no-cache-dir --force-reinstall' poetry && \
    pip install --upgrade pip
COPY .devcontainer/config/pypoetry_config.toml /home/vscode/.config/pypoetry/config.toml
RUN chown -R vscode:vscode /home/vscode/.config && \
    /usr/local/py-utils/bin/poetry completions bash > /etc/bash_completion.d/poetry.bash-completion && \
    python -m venv /workspace/.venv --prompt feria && \
    chown -R vscode:vscode /workspace/.venv

Again, this is based mostly on Microsoft’s default but with a number of noteworthy changes:

  • The postgresql-client OS package is installed in order to make the psql command available in the development environment.
  • pipx, which comes pre-installed in the Python image, is used to globally install isort and poetry.
  • A configuration file for poetry is copied into the container (more on that in a bit).
  • Shell completion hints for poetry in the bash shell are installed.
  • The development virtual environment is created. As indicated above, this can be skipped, but I find it easier to work with a virtual environment rather than the root-owned global Python environment in the container. By placing the virtual environment inside the /workspace folder, it will survive rebuilds of the container and some utilities in VS Code will detect it automatically.
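If you ever need to confirm that VS Code (or a terminal session) has actually picked up the workspace virtual environment, you can ask the interpreter itself; this check works in any Python 3.3+:

```python
import sys

# Inside a virtual environment, sys.prefix points at the venv root
# (e.g. /workspace/.venv), while sys.base_prefix still points at the
# underlying Python installation; the two are equal outside a venv.
print(sys.prefix)
print("virtual environment active:", sys.prefix != sys.base_prefix)
```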

The configuration file for poetry simply tells poetry not to create new virtual environments. This is not absolutely necessary. If you prefer, you can omit both this file and the virtual environment creation step in the Dockerfile; poetry will then create a virtual environment for you when invoked. I just prefer to have control over where the virtual environment is created for the reasons mentioned above.

.devcontainer/config/pypoetry_config.toml

[virtualenvs]
create = false

Finally, the devcontainer.json file, based on Microsoft’s default:

.devcontainer/devcontainer.json

{
    "name": "{{ project name }}",
    "dockerComposeFile": "docker-compose.yml",
    "service": "app",
    "workspaceFolder": "/workspace",
    "settings": {
        "editor.formatOnSave": true,
        "python.analysis.extraPaths": [
            "/workspace/source"
        ],
        "python.defaultInterpreterPath": "/workspace/.venv/bin/python",
        "python.formatting.blackPath": "/usr/local/py-utils/bin/black",
        "python.formatting.provider": "black",
        "python.languageServer": "Pylance",
        "python.linting.enabled": true,
        "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy",
        "python.linting.pylintEnabled": true,
        "python.linting.pylintPath": "/workspace/.venv/bin/pylint",
        "python.testing.pytestPath": "/usr/local/py-utils/bin/pytest",
        "isort.path": [
            "/usr/local/py-utils/bin/isort"
        ],
        "rewrap.wrappingColumn": 88,
        "[python]": {
            "editor.codeActionsOnSave": {
                "source.organizeImports": true
            }
        }
    },
    "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "bungcip.better-toml"
    ],
    "forwardPorts": [
        {{ your project’s preferred port, default 8000 for Django }}
    ],
    "postCreateCommand": "VIRTUAL_ENV=\"/workspace/.venv\" PATH=\"$VIRTUAL_ENV/bin:$PATH\" poetry install --no-interaction --no-ansi --with dev",
    "remoteUser": "vscode"
}

Including some VS Code settings here can help your team establish consistent practices around code formatting and linting. Just as with the customizations above, this is not strictly necessary, but I find it to be good practice.

  • A number of widely-used Python development utilities, like Black and pytest, come pre-installed in the Python dev container image; their paths are referenced in the devcontainer.json file. I have activated the ones I prefer to use.
  • Although Pylint also comes pre-installed, I find it works better when installed inside the virtual environment; that is the version that is referenced. We installed isort earlier, so that is also referenced.
  • I install a TOML extension for highlighting the pyproject.toml file; be sure to include any other extensions you like to use.
  • Finally, the postCreateCommand automatically uses poetry to install any dependencies, including dev dependencies.

  1. Née Cheeseshop

  2. Note that the quotation marks surrounding the Python version number are necessary; otherwise, 3.10 will be interpreted as a floating point number and simplified to 3.1! Hopefully no one is still developing with Python 3.1 at this point. If you are using an Apple Silicon-based Mac, however, you should append -bullseye to the VARIANT value and then you needn’t use the quote marks as that will be interpreted as a string. Other versions of Python, such as 3.11, also don’t need the quote marks as they are not susceptible to the trailing zero simplification.