DVC

Experiment tracking
Data versioning
Pipeline orchestration

DVC is an open-source version control system for machine learning projects.

Use it when

  • You want to version arbitrarily large files, datasets, and models with a Git-like workflow.
  • You want many options for remote storage of your data (S3, MinIO, Google Cloud Storage, Google Drive, Azure Blob Storage, etc.).
  • You want to create pipelines and track your experiments.
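
As a rough illustration of the first two points, the sketch below tracks one data file with DVC and uploads it to remote storage. It assumes the current directory is already a Git repository in which `dvc init` has been run. The function name `track_and_push`, the remote name `storage`, and the example path and bucket are hypothetical placeholders, not part of DVC itself.

```shell
#!/bin/sh
# Sketch only: version a data file with DVC and push it to a remote.
# Assumes a Git repository where `dvc init` has already been run.
# "storage" is a hypothetical remote name chosen for this example.
track_and_push() {
    data_file="$1"     # e.g. data/raw.csv (hypothetical path)
    remote_url="$2"    # e.g. s3://my-bucket/dvc-store (hypothetical bucket)

    # Move the file into DVC's cache and write a small "$data_file.dvc" metafile.
    dvc add "$data_file" || return 1

    # Register the remote; -d makes it the default for push and pull.
    dvc remote add -d storage "$remote_url" || return 1

    # Git tracks only the metafile, the generated .gitignore, and the DVC config.
    git add "$data_file.dvc" "$(dirname "$data_file")/.gitignore" .dvc/config || return 1
    git commit -m "Track $data_file with DVC" || return 1

    # Upload the actual data to the remote.
    dvc push
}
```

A call might look like `track_and_push data/raw.csv s3://my-bucket/dvc-store`. Some storage backends need an extra at install time, for example `pip install "dvc[s3]"` for S3.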

Watch out

  • You must use DVC alongside a Git repository to enable its versioning features.
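
In practice this means initializing Git before DVC. The sketch below shows one way to set up a fresh project. The function name `init_dvc_repo` and the directory argument are hypothetical; the commit step also assumes a Git user name and email are already configured.

```shell
#!/bin/sh
# Sketch only: create a project directory with Git and DVC initialized.
# Note: the function changes the caller's working directory.
init_dvc_repo() {
    project_dir="$1"   # hypothetical project path, e.g. ./my-ml-project

    mkdir -p "$project_dir" && cd "$project_dir" || return 1

    # DVC expects a Git repository, so create one first if needed.
    [ -d .git ] || git init || return 1

    # Creates the .dvc/ directory and stages its files in Git.
    [ -d .dvc ] || dvc init || return 1

    # Record the DVC setup in Git history.
    git commit -m "Initialize DVC"
}
```

DVC can also be initialized without Git via `dvc init --no-scm`, but the Git-based versioning features are then unavailable.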

Example stacks

Airflow + MLflow stack

Installation

pip install dvc
