Model serving

TorchServe is a flexible and easy-to-use tool for serving and scaling PyTorch models in production.

Use it when

  • You want to serve PyTorch models and do not need a framework-agnostic serving tool.
  • You want integration with popular tools such as KServe, Kubeflow, MLflow, SageMaker, and Vertex AI.
  • You want REST and gRPC support for batch inference.
  • You want to version and scale your models.
  • You want support for exporting metrics to Prometheus.
  • You want to serve an ensemble of PyTorch models and Python functions, executed as a DAG that a workflow defines.
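
To make the REST, versioning, scaling, and metrics points above more concrete, here is a minimal sketch of how a client might address a running TorchServe instance. It assumes the default local ports (8080 for inference, 8081 for management, 8082 for metrics), which a deployment can change in its configuration. The helper function names and the model name `resnet18` are hypothetical, chosen only for illustration. The functions build requests but do not send them, so nothing here needs a live server.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed defaults for a local TorchServe instance; adjust to your deployment.
INFERENCE_URL = "http://127.0.0.1:8080"
MANAGEMENT_URL = "http://127.0.0.1:8081"
METRICS_URL = "http://127.0.0.1:8082"


def prediction_request(model_name: str, payload: bytes,
                       version: Optional[str] = None) -> Request:
    """Build (but do not send) a POST to the inference API.

    When `version` is given, the request targets that specific model version.
    """
    path = f"/predictions/{quote(model_name)}"
    if version is not None:
        path += f"/{quote(version)}"
    return Request(INFERENCE_URL + path, data=payload, method="POST")


def scale_workers_request(model_name: str, min_worker: int) -> Request:
    """Build a PUT to the management API that sets the minimum worker count."""
    query = urlencode({"min_worker": min_worker})
    url = f"{MANAGEMENT_URL}/models/{quote(model_name)}?{query}"
    return Request(url, method="PUT")


def metrics_request() -> Request:
    """Build a GET for the Prometheus-formatted metrics endpoint."""
    return Request(f"{METRICS_URL}/metrics", method="GET")


if __name__ == "__main__":
    # "resnet18" is a hypothetical model name used only for illustration.
    for req in (
        prediction_request("resnet18", b"<image bytes>", version="2.0"),
        scale_workers_request("resnet18", min_worker=3),
        metrics_request(),
    ):
        print(req.get_method(), req.full_url)
```

With a server running and a model registered, each request could be sent with `urllib.request.urlopen(req)`. Keeping request construction separate from sending makes the URL layout easy to inspect and check offline.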

Example stacks

Airflow + MLflow stack


git clone
cd serve
python ./ts_scripts/ --cuda=cu111
pip install torchserve torch-model-archiver torch-workflow-archiver