Use it when
- You are training large deep learning models that would benefit from GPU parallelization.
- You want compatibility with TensorFlow, Keras, PyTorch, and Apache MXNet.
- You want the flexibility to run a script adapted to work with Horovod on a single GPU, multiple GPUs, or multiple nodes without any further code changes.
Watch out
- You must reinstall Horovod when upgrading or downgrading TensorFlow, Keras, or PyTorch.
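- Requires adapting your training script: you must initialize Horovod, pin each process to a GPU, wrap the optimizer, and broadcast the initial state from rank 0.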
- Before serving, the model must be optimized for inference so that it no longer references Horovod operations.
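As a rough illustration of what adapting a training script involves, here is a minimal PyTorch sketch. The toy model, the synthetic batches, and the helper names `scaled_learning_rate` and `train` are hypothetical and exist only for this example. Scaling the learning rate by the number of workers is a common convention, not a requirement. Treat this as one possible shape for the changes, not a definitive recipe.

```python
def scaled_learning_rate(base_lr: float, world_size: int) -> float:
    """Scale the single-worker learning rate linearly with the worker count.

    Hypothetical helper for this sketch; linear scaling is a convention.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    return base_lr * world_size


def train(epochs: int = 1, base_lr: float = 0.01) -> None:
    """Hypothetical entry point showing the Horovod-specific changes."""
    # Imported inside the function so this file can still be imported on a
    # machine where Horovod or PyTorch is not installed.
    import torch
    import horovod.torch as hvd

    hvd.init()  # must run before any other Horovod call

    # Pin each process to one GPU, chosen by its rank on the local node.
    if torch.cuda.is_available():
        torch.cuda.set_device(hvd.local_rank())
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")

    model = torch.nn.Linear(10, 1).to(device)  # toy model (assumption)
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=scaled_learning_rate(base_lr, hvd.size()),
    )

    # Wrap the optimizer so gradients are averaged across all workers.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )

    # Start every worker from the same weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

    for epoch in range(epochs):
        # Synthetic batch standing in for a real per-worker data shard.
        inputs = torch.randn(32, 10, device=device)
        targets = torch.randn(32, 1, device=device)
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        loss.backward()
        optimizer.step()
        if hvd.rank() == 0:  # log from a single worker only
            print(f"epoch {epoch}: loss {loss.item():.4f}")
```

A script that calls `train()` could then be started with `horovodrun -np 4 python train.py` on one machine, or with something like `horovodrun -np 4 -H host1:2,host2:2 python train.py` across two nodes, where `train.py`, `host1`, and `host2` are placeholder names.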
Example stacks
Airflow + MLflow stack
Installation
pip install horovod