MLflow Guide

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It has three primary components:

  • Tracking: Lets you record experiments and compare their parameters and results.
  • Models: Lets you manage models from a variety of ML libraries and deploy them to a variety of model serving and inference platforms.
  • Projects: Lets you package ML code in a reusable, reproducible form to share with other data scientists or transfer to production.

MLflow supports Java, Python, R, and REST APIs.
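To give a sense of what the REST API looks like from a client, here is a minimal sketch that builds, but does not send, a request to log a single metric to an existing run. The helper name `build_log_metric_request` is hypothetical. The endpoint path `/api/2.0/mlflow/runs/log-metric` and the bearer-token header are assumptions about how the tracking server is exposed, so check them against the REST API reference for your deployment.

```python
import json
import time
import urllib.request


def build_log_metric_request(host, token, run_id, key, value):
    """Build (but do not send) a POST request that logs one metric to a run.

    Hypothetical helper; the endpoint path and auth scheme are assumptions.
    """
    payload = {
        "run_id": run_id,
        "key": key,
        "value": value,
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/mlflow/runs/log-metric",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the payload easy to inspect before it reaches a live server.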

Azure Databricks provides a fully managed and hosted version of MLflow. It is integrated with enterprise security features, high availability, and other Azure Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Azure Databricks offers an integrated experience for tracking and securing model training runs and for running machine learning projects.

Preview

  • This feature is in Public Preview.
  • The R API is not supported in the Public Preview but is under development.
  • For pricing information, see the Databricks pricing page.

The first topic provides a quick start that demonstrates the basic MLflow tracking APIs. The subsequent topics introduce each MLflow component and describe how these components are hosted within Azure Databricks. They include numerous Azure Databricks notebooks and examples that illustrate how to use each MLflow component.