Use it when
- You want to version arbitrarily large files, datasets, and models with a Git-like workflow.
- You want a wide choice of remote storage backends for your data (S3, MinIO, Google Cloud Storage, Google Drive, Azure Blob Storage, etc.).
- You want to create pipelines and track your experiments.
Watch out
- You must use DVC alongside a Git repository to enable its versioning features.
Example stacks
Airflow + MLflow stack
Installation
pip install dvc
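
Once DVC is installed, the Git-plus-DVC workflow described above might look roughly like the sketch below. It is an illustration rather than an official quickstart: the file name `data.csv`, the remote name `localremote`, and the temporary directories are all placeholders chosen for this example, and a local folder stands in for a cloud remote such as S3.

```shell
# Work in a throwaway directory so no existing project is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# DVC's versioning features build on Git, so create the Git repository first.
git init
dvc init

# A tiny placeholder file standing in for a large dataset.
printf 'id,value\n1,42\n' > data.csv

# Track the file with DVC. This writes a small pointer file (data.csv.dvc)
# and lists data.csv in .gitignore so the data itself stays out of Git.
dvc add data.csv

# Stage the pointer file and the ignore rule in Git.
git add data.csv.dvc .gitignore

# Use a local directory as the default remote (hypothetical name "localremote"),
# then upload the tracked data to it.
remote_dir="$(mktemp -d)"
dvc remote add -d localremote "$remote_dir"
dvc push
```

Git ends up versioning only the small `.dvc` pointer file, while the data itself lives in the DVC cache and the configured remote. Swapping the local directory for a cloud backend should only change the `dvc remote add` line, although cloud remotes generally need extra dependencies and credentials that this sketch does not cover.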