Use it when
- You want to create workflows as DAGs (Directed Acyclic Graphs).
- You want to define workflows as standalone objects.
- You need fast scheduling of DAGs.
- You want to explicitly define input and output for individual jobs to streamline data movement between tasks.
- You want to cache and persist inputs and outputs.
- You want a transform function that accepts both reference data (batch) and live data.
- You want an easy way to create dynamic workflows.
- You want to generate pipelines as Python code.
- You want to automate ML workflows.
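To make the DAG, explicit input/output, and caching points above concrete, here is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+). It is not the API of any particular orchestrator. The task functions, the `WORKFLOW` mapping, and the `run` helper are all hypothetical names chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each takes its upstream outputs as arguments
# and returns its own output, so data movement is explicit.
def extract():
    return [3, 1, 2]

def transform(rows):
    return sorted(rows)

def load(rows):
    return {"rows": len(rows), "data": rows}

# The workflow as a standalone object: task name -> (function, upstream task names).
WORKFLOW = {
    "extract": (extract, ()),
    "transform": (transform, ("extract",)),
    "load": (load, ("transform",)),
}

def run(workflow, cache=None):
    """Run tasks in dependency order, skipping any task already in the cache."""
    cache = {} if cache is None else cache
    # TopologicalSorter expects a mapping of node -> set of predecessors
    # and raises graphlib.CycleError if the graph is not acyclic.
    graph = {name: set(deps) for name, (_, deps) in workflow.items()}
    executed = []
    for name in TopologicalSorter(graph).static_order():
        if name in cache:
            continue  # output already persisted from an earlier run
        func, deps = workflow[name]
        cache[name] = func(*(cache[dep] for dep in deps))
        executed.append(name)
    return cache, executed

if __name__ == "__main__":
    results, executed = run(WORKFLOW)
    print(executed)
    print(results["load"])
```

Passing the returned cache into a second `run` call skips every task whose output is already stored. A real orchestrator would typically persist such a cache outside the process; the in-memory dict here is only a stand-in.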
Watch out
- Compute and storage are only loosely abstracted, which makes local development with large datasets tricky.
Example stacks
Airflow + MLflow stack
Installation
pip install prefect