The Arrow project uses Docker Compose to make development and testing environments easier to reproduce. Our environment requirements are not as complicated, but there are some places where this could help:

  1. SQL test cases would become more reproducible for developers

  2. We can more clearly delineate what is required for particular development and testing workflows

Particularly for point 2 above, we currently have a rather monolithic environment.yml file that could be broken up into separate docker compose services.

A sample file might look something like this:

  services:

    # Each service must be nested under the top-level `services` key.
    development:
      image: python:3.10
      build:
        context: .
        dockerfile: ci/docker/pandas-dev.dockerfile
      volumes:
        - .:/pandas

    lint:
      image: python:3.10
      build:
        context: .
        dockerfile: ci/docker/pandas-lint.dockerfile
      volumes:
        - .:/pandas

    testing:
      image: python:3.10
      build:
        context: .
        dockerfile: ci/docker/pandas-testing.dockerfile
      volumes:
        - .:/pandas

    sql-testing:
      image: python:3.10
      build:
        context: .
        dockerfile: ci/docker/pandas-sql-testing.dockerfile
      volumes:
        - .:/pandas

This is just a rough sketch; we can get more advanced and parametrize the Python versions to match what we support.
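
As one possible way to parametrize the Python version, Compose can substitute environment variables with a default value. The fragment below is only a sketch: the `PYTHON_VERSION` variable name is an assumption, and it presumes the dockerfile declares a matching `ARG PYTHON_VERSION` and uses it in its `FROM` line, which none of the files named above are known to do yet.

```yaml
services:

  testing:
    build:
      context: .
      dockerfile: ci/docker/pandas-testing.dockerfile
      args:
        # Hypothetical build argument; falls back to 3.10 when the
        # PYTHON_VERSION environment variable is unset or empty.
        PYTHON_VERSION: ${PYTHON_VERSION:-3.10}
    volumes:
      - .:/pandas
```

With something like this in place, a developer could pick a version at invocation time, e.g. `PYTHON_VERSION=3.11 docker compose run --rm testing`, while CI could loop over the supported versions using the same file.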