Containerized Execution

pyCyto supports hybrid execution: pipeline tasks can run either locally (baremetal) or in isolated container environments. This is useful when different tasks need different, sometimes conflicting, dependency versions (e.g. StarDist’s TensorFlow vs. Cellpose’s PyTorch).

Overview

  • Hybrid Execution: mix container and baremetal execution within the same pipeline

  • Environment Isolation: reproducible, consistent execution environments per task

  • Multiple Runners: Docker (and Singularity, for cluster deployments)

  • Backward Compatibility: existing pipeline YAMLs continue to work unchanged

Configuration

Pipeline tasks are defined using a DAG-based YAML format:

# pipeline.container_example.yaml
tasks:
  tcell_segmentation:
    module: cyto.segmentation.cellpose.CellPose
    params:
      model_type: cyto2
      cellprob_thresh: -1
      gpu: true
      channels: [0, 0]
      batch_size: 8
      diameter: 25
      verbose: true
    dependencies: []
    tags:
      - segmentation
      - tcells
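
Because each task lists its dependencies, a valid execution order can be derived with a topological sort. The sketch below is illustrative and is not pyCyto's scheduler: it uses Python's standard graphlib on a plain dict shaped like the tasks mapping, and the tcell_features entry with its dependency on tcell_segmentation is a hypothetical addition so that the graph has one edge.

```python
from graphlib import TopologicalSorter

# Task table shaped like the `tasks:` mapping, reduced to the `dependencies`
# field. `tcell_features` and its dependency are hypothetical, added only so
# the graph has an edge to order.
tasks = {
    "tcell_segmentation": {"dependencies": []},
    "tcell_features": {"dependencies": ["tcell_segmentation"]},
}


def execution_order(tasks):
    """Return task names so that every task comes after its dependencies."""
    graph = {name: set(spec.get("dependencies", [])) for name, spec in tasks.items()}
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
print(order)  # ['tcell_segmentation', 'tcell_features']
```

A dependency cycle would raise graphlib.CycleError instead of returning an order, which is a convenient place to reject an invalid pipeline file.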

Execution environments and compute resources are defined separately:

# pipeline-resources.container_example.yaml
execution_configs:
  # Hybrid execution - segmentation in containers, features on baremetal
  hybrid:
    tcell_segmentation:
      type: container
      runner: docker
      image: cellpose:latest
      resources:
        memory: 8Gi
        gpu: 1

    tcell_features:
      type: baremetal  # Runs in the pipeline engine's own environment (its container, if it runs in one)

  # Full container execution
  container:
    tcell_segmentation:
      type: container
      runner: docker
      image: cellpose:latest

  # All baremetal (default when running everything inside one container)
  local:
    tcell_segmentation:
      type: baremetal
    tcell_features:
      type: baremetal

default_profile: hybrid
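
To make the container entries concrete, here is a minimal sketch of how one task's configuration could map onto a docker run invocation. It is an assumption for illustration, not pyCyto's actual runner: docker_command and to_docker_memory are hypothetical helpers, and the sketch assumes that memory uses Kubernetes-style binary suffixes (8Gi) which are translated to Docker's own suffixes (8g). The --rm, --memory and --gpus flags are standard docker run options.

```python
def to_docker_memory(value):
    """Convert a Kubernetes-style size such as '8Gi' to Docker's form ('8g')."""
    suffixes = {"Ki": "k", "Mi": "m", "Gi": "g"}
    for k8s_suffix, docker_suffix in suffixes.items():
        if value.endswith(k8s_suffix):
            return value[: -len(k8s_suffix)] + docker_suffix
    return value  # assume the value is already in a form Docker accepts


def docker_command(task_cfg):
    """Build a `docker run` argument list from one task's execution config.

    Hypothetical helper: the mapping from config keys to flags is an assumption.
    """
    if task_cfg.get("type") != "container" or task_cfg.get("runner") != "docker":
        raise ValueError("not a Docker container task")
    resources = task_cfg.get("resources", {})
    cmd = ["docker", "run", "--rm"]
    if "memory" in resources:
        cmd += ["--memory", to_docker_memory(str(resources["memory"]))]
    if resources.get("gpu"):
        cmd += ["--gpus", str(resources["gpu"])]
    cmd.append(task_cfg["image"])
    return cmd


# Same values as tcell_segmentation in the hybrid profile.
hybrid_segmentation = {
    "type": "container",
    "runner": "docker",
    "image": "cellpose:latest",
    "resources": {"memory": "8Gi", "gpu": 1},
}
command_line = " ".join(docker_command(hybrid_segmentation))
print(command_line)  # docker run --rm --memory 8g --gpus 1 cellpose:latest
```

A baremetal entry has no image or runner, so the helper rejects it rather than guessing at a command.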

Usage

# Run with hybrid execution (default)
cyto --pipeline pipelines/pipeline.container_example.yaml -v

# The system automatically loads the matching resources file:
# pipelines/pipeline-resources.container_example.yaml
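
The matching file name appears to follow a simple convention. The sketch below infers it from the two example file names only (pipeline.<name>.yaml maps to pipeline-resources.<name>.yaml in the same directory); resources_path is a hypothetical helper and the real lookup may differ.

```python
from pathlib import Path


def resources_path(pipeline_path):
    """Derive the sibling resources file for a pipeline YAML.

    Assumed convention, inferred from the example file names only:
    pipeline.<name>.yaml -> pipeline-resources.<name>.yaml
    """
    path = Path(pipeline_path)
    prefix = "pipeline."
    if not path.name.startswith(prefix):
        raise ValueError(f"unexpected pipeline file name: {path.name}")
    return path.with_name("pipeline-resources." + path.name[len(prefix):])


derived = resources_path("pipelines/pipeline.container_example.yaml")
print(derived.as_posix())  # pipelines/pipeline-resources.container_example.yaml
```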

The execution profile is selected in this order of precedence:

  1. execution_profile set in the pipeline YAML

  2. default_profile in the resources YAML

  3. System default (hybrid)
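
The precedence above can be sketched as a short fallback chain. The function below works on already-parsed YAML (plain dicts); resolve_profile and SYSTEM_DEFAULT_PROFILE are hypothetical names, and only the key names execution_profile and default_profile come from the text.

```python
SYSTEM_DEFAULT_PROFILE = "hybrid"  # step 3 in the list above


def resolve_profile(pipeline_cfg, resources_cfg):
    """Pick the execution profile: pipeline YAML, then resources YAML, then default."""
    return (
        pipeline_cfg.get("execution_profile")
        or resources_cfg.get("default_profile")
        or SYSTEM_DEFAULT_PROFILE
    )


# The pipeline YAML sets nothing, so the resources file's default applies.
print(resolve_profile({}, {"default_profile": "local"}))  # local
# An explicit execution_profile in the pipeline YAML takes priority.
print(resolve_profile({"execution_profile": "container"}, {"default_profile": "local"}))  # container
```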

Troubleshooting

Container Not Found: build or pull the referenced image before running the pipeline.
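
One way to catch this before a run is a small pre-flight check. The sketch below is not part of pyCyto: image_available and missing_images are hypothetical helpers that shell out to the standard docker image inspect command, which exits non-zero when an image is not present locally. The run parameter exists only so the check can be exercised without Docker.

```python
import subprocess


def image_available(image, run=subprocess.run):
    """Return True if `docker image inspect` finds the image locally."""
    try:
        result = run(
            ["docker", "image", "inspect", image],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        return False  # docker CLI is not installed or not on PATH
    return result.returncode == 0


def missing_images(profile, run=subprocess.run):
    """List images used by container tasks in a profile that are absent locally."""
    images = {cfg["image"] for cfg in profile.values() if cfg.get("type") == "container"}
    return sorted(image for image in images if not image_available(image, run))
```

Calling missing_images on the parsed hybrid profile would return an empty list once cellpose:latest has been built or pulled.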

Docker Permission Issues (Linux):

sudo usermod -aG docker $USER
# Log out and log back in for the group change to take effect

Resource Constraints: if the host can’t satisfy the requested profile, reduce resources.memory or disable resources.gpu in the resources YAML.