Model serving

Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container.

Use it when

  • You want to package your model in a Docker container without writing a Dockerfile.
  • You want GPU, CUDA, and cuDNN support without worrying about getting the setup and compatibility right.
  • You want an automatic and configurable FastAPI endpoint.
  • You want an out-of-the-box, Redis-backed queue for serving long-running models.
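To make the packaging model concrete, here is a minimal sketch of a Cog predictor. A project pairs a `cog.yaml` (declaring the Python version and dependencies) with a `predict.py` like the one below; the model and the `text` input here are illustrative placeholders, not from the source.

```python
# predict.py — a minimal Cog predictor sketch.
# The "model" below is a stand-in; a real project would load weights in setup().
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # Runs once when the container starts, so expensive model
        # loading happens before any prediction request arrives.
        self.model = lambda text: text.upper()

    def predict(self, text: str = Input(description="Text to transform")) -> str:
        # Cog derives the FastAPI endpoint and input validation
        # from this method's signature and Input() annotations.
        return self.model(text)
```

With this in place, `cog predict -i text=hello` runs a prediction locally and `cog build` produces the Docker image, no Dockerfile required.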

Example stacks

Airflow + MLflow stack