Use it when
- You want to package your model in a Docker container without writing a Dockerfile.
- You want GPU, CUDA, and cuDNN support without having to get the setup and version compatibility right yourself.
- You want an automatically generated, configurable FastAPI endpoint.
- You want an out-of-the-box, Redis-based queue for serving long-running models.
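To illustrate the queue-based serving pattern from the last point, here is a minimal stdlib-only sketch: requests are enqueued, a background worker runs the long-running model call, and results are collected by job ID. Redis is swapped for an in-process `queue.Queue`, and `predict` is a hypothetical stand-in for a real model, so this shows the shape of the pattern, not the tool's actual implementation.

```python
import queue
import threading

def predict(payload):
    # Hypothetical stand-in for a long-running model inference call.
    return {"input": payload, "label": "cat"}

jobs = queue.Queue()   # a Redis list would play this role in production
results = {}           # job_id -> prediction result

def worker():
    while True:
        job_id, payload = jobs.get()
        if job_id is None:        # sentinel: shut the worker down
            break
        results[job_id] = predict(payload)

# One background worker draining the queue.
t = threading.Thread(target=worker, daemon=True)
t.start()

# Enqueue a prediction request, then signal shutdown and wait.
jobs.put(("job-1", {"image": "cat.png"}))
jobs.put((None, None))
t.join()
print(results["job-1"]["label"])  # -> cat
```

In a real deployment the queue is external (e.g. Redis) so that the HTTP endpoint can return a job ID immediately and clients poll for the result, keeping slow model calls from blocking request handling.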
Example stacks
Airflow + MLflow stack