MLOps: Model management, deployment, and monitoring with Azure Machine Learning
APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)
In this article, learn how to apply Machine Learning Operations (MLOps) practices in Azure Machine Learning to manage the lifecycle of your models. Applying MLOps practices can improve the quality and consistency of your machine learning solutions.
What is MLOps?
MLOps is based on DevOps principles and practices that increase the efficiency of workflows. Examples include continuous integration, delivery, and deployment. MLOps applies these principles to the machine learning process, with the goal of:
- Faster experimentation and development of models.
- Faster deployment of models into production.
- Quality assurance and end-to-end lineage tracking.
MLOps in Machine Learning
Machine Learning provides the following MLOps capabilities:
- Create reproducible machine learning pipelines. Use machine learning pipelines to define repeatable and reusable steps for your data preparation, training, and scoring processes.
- Create reusable software environments. Use these environments for training and deploying models.
- Register, package, and deploy models from anywhere. You can also track associated metadata required to use the model.
- Capture the governance data for the end-to-end machine learning lifecycle. The logged lineage information can include who is publishing models and why changes were made. It can also include when models were deployed or used in production.
- Notify and alert on events in the machine learning lifecycle. Event examples include experiment completion, model registration, model deployment, and data drift detection.
- Monitor machine learning applications for operational and machine learning-related issues. Compare model inputs between training and inference. Explore model-specific metrics. Provide monitoring and alerts on your machine learning infrastructure.
- Automate the end-to-end machine learning lifecycle with Machine Learning and Azure Pipelines. By using pipelines, you can frequently update models. You can also test new models. You can continually roll out new machine learning models alongside your other applications and services.
For more information on MLOps, see Machine learning DevOps.
Create reproducible machine learning pipelines
Use machine learning pipelines from Machine Learning to stitch together all the steps in your model training process.
A machine learning pipeline can contain steps from data preparation to feature extraction to hyperparameter tuning to model evaluation. For more information, see Machine learning pipelines.
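As an illustration, the following is a minimal sketch of a two-step pipeline (data preparation, then training) written with the Python SDK azure-ai-ml v2. The workspace details, the ./src folder with prep.py and train.py, the environment and compute names, and the data asset reference are assumptions for this example, not part of the article.

```python
# Minimal sketch of a reproducible two-step pipeline with the azure-ai-ml (v2) SDK.
from azure.ai.ml import MLClient, command, Input, Output
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

# Hypothetical workspace details; replace with your own.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)

# Each step is a reusable command job defined inline.
prep_step = command(
    code="./src",  # hypothetical folder containing prep.py
    command="python prep.py --raw_data ${{inputs.raw_data}} --prepped_data ${{outputs.prepped_data}}",
    inputs={"raw_data": Input(type="uri_folder")},
    outputs={"prepped_data": Output(type="uri_folder")},
    environment="azureml:my-train-env@latest",  # assumed registered environment
    compute="cpu-cluster",                      # assumed compute cluster
)

train_step = command(
    code="./src",  # hypothetical folder containing train.py
    command="python train.py --training_data ${{inputs.training_data}}",
    inputs={"training_data": Input(type="uri_folder")},
    environment="azureml:my-train-env@latest",
    compute="cpu-cluster",
)

@pipeline(description="Prep data, then train a model")
def training_pipeline(pipeline_input_data):
    # Wire the output of the prep step into the training step.
    prep_job = prep_step(raw_data=pipeline_input_data)
    train_step(training_data=prep_job.outputs.prepped_data)

# Build the pipeline job from a registered data asset (assumed name) and submit it.
pipeline_job = training_pipeline(
    pipeline_input_data=Input(type="uri_folder", path="azureml:my-dataset@latest")
)
submitted = ml_client.jobs.create_or_update(pipeline_job, experiment_name="mlops-demo")
```

Because each step declares its code, inputs, outputs, environment, and compute, the same pipeline definition can be rerun or reused without manual setup.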
If you use the designer to create your machine learning pipelines, you can select the ... icon in the upper-right corner of the designer page at any time, and then select Clone. Cloning your pipeline lets you iterate on your pipeline design without losing your old versions.
Create reusable software environments
By using Machine Learning environments, you can track and reproduce your projects' software dependencies as they evolve. You can use environments to ensure that builds are reproducible without manual software configurations.
Environments describe the pip and conda dependencies for your projects. You can use them for training and deployment of models. For more information, see What are Machine Learning environments?.
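For example, here's a minimal sketch of defining and registering a reusable environment with the Python SDK azure-ai-ml v2. It assumes the MLClient from the pipeline sketch earlier and a conda specification at ./environments/conda.yaml; both are placeholders for this example.

```python
# Minimal sketch of registering a reusable environment with the azure-ai-ml (v2) SDK.
from azure.ai.ml.entities import Environment

train_env = Environment(
    name="my-train-env",
    description="Shared environment for training and deployment",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    conda_file="./environments/conda.yaml",  # assumed pip and conda dependency file
)

# Registering the environment versions it in the workspace, so the same dependency
# set can be reused across training jobs and deployments.
registered_env = ml_client.environments.create_or_update(train_env)
print(registered_env.name, registered_env.version)
```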
Register, package, and deploy models from anywhere
The following sections discuss how to register, package, and deploy models.
Register and track machine learning models
With model registration, you can store and version your models in the Azure cloud, in your workspace. The model registry makes it easy to organize and keep track of your trained models.
Registered models are identified by name and version. Each time you register a model with the same name as an existing model, the registry increments the version. You can provide more metadata tags during registration, and these tags are used when you search for a model. Machine Learning supports any model that can be loaded by using Python 3.5.2 or higher.
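For example, a minimal registration sketch with the Python SDK azure-ai-ml v2, assuming the MLClient from the earlier sketches. The model name, local path, tags, and framework are placeholders for this example.

```python
# Minimal sketch of registering and tagging a trained model with the azure-ai-ml (v2) SDK.
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    name="credit-default-model",   # hypothetical model name
    path="./model",                # local folder (or a job output) with the model files
    type=AssetTypes.CUSTOM_MODEL,  # other options include MLFLOW_MODEL and TRITON_MODEL
    description="Scikit-learn classifier trained on credit data",
    tags={"framework": "scikit-learn", "training-dataset": "credit-2024"},
)

# Registering under an existing name increments the version automatically.
registered_model = ml_client.models.create_or_update(model)
print(registered_model.name, registered_model.version)
```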
Package and debug models
Before you deploy a model into production, it's packaged into a Docker image. In most cases, image creation happens automatically in the background during deployment. You can manually specify the image.
If you run into problems with a deployment, you can deploy in your local development environment to troubleshoot and debug, as sketched below.
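The following is a minimal local-debugging sketch with the Python SDK azure-ai-ml v2. It assumes Docker is installed on the local machine and that the model file, conda specification, scoring script, and sample request exist at the hypothetical paths shown.

```python
# Minimal sketch of deploying locally (local=True) to debug before publishing to Azure.
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)

# Create a local endpoint backed by a Docker container.
endpoint = ManagedOnlineEndpoint(name="credit-endpoint-local", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

# Local deployments use local files for the model, environment, and entry script.
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="credit-endpoint-local",
    model=Model(path="./model/model.pkl"),  # hypothetical serialized model
    environment=Environment(
        conda_file="./environments/conda.yaml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    ),
    code_configuration=CodeConfiguration(code="./src", scoring_script="score.py"),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment, local=True)

# Invoke the local endpoint with a sample request to verify the scoring logic.
response = ml_client.online_endpoints.invoke(
    endpoint_name="credit-endpoint-local",
    request_file="./sample-request.json",  # hypothetical JSON payload
    local=True,
)
print(response)
```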
Convert and optimize models
Converting your model to Open Neural Network Exchange (ONNX) might improve performance. On average, converting to ONNX can double performance.
For more information on ONNX with Machine Learning, see Create and accelerate machine learning models.
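As a hedged illustration of the conversion step, the following sketch converts a scikit-learn classifier to ONNX with the skl2onnx package and scores it with ONNX Runtime. Neither package is part of the azure-ai-ml SDK, and the toy model and input shape are assumptions for this example.

```python
# Minimal sketch: convert a scikit-learn model to ONNX, then score it with ONNX Runtime.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as ort

# Train a small model purely for illustration.
X = np.random.rand(100, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
clf = LogisticRegression().fit(X, y)

# Convert to ONNX, declaring the expected input name and shape.
onnx_model = convert_sklearn(clf, initial_types=[("input", FloatTensorType([None, 4]))])
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Score with ONNX Runtime; the exported file can then be registered and deployed
# like any other model.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
predictions = session.run(None, {"input": X[:5]})[0]
print(predictions)
```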
Use models
Trained machine learning models are deployed as endpoints in the cloud or locally. Deployments use CPU and GPU for inferencing. When you deploy a model as an endpoint, you provide the following items:
- The model that's used to score data submitted to the service or device.
- An entry script (sketched later in this section). This script accepts requests, uses the model to score the data, and returns a response.
- A Machine Learning environment that describes the pip and conda dependencies required by the model and entry script.
- Any other assets, such as text and data, that the model and entry script require.
You also provide the configuration of the target deployment platform. For example, the VM family type, available memory, and number of cores. When the image is created, components that Azure Machine Learning requires are also added. For example, assets needed to run the web service.
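The entry script follows a simple contract: Azure Machine Learning calls init() once when the container starts and run() for each scoring request. Below is a minimal sketch; the model file name, the joblib format, and the JSON payload shape are assumptions for this example.

```python
# score.py: a minimal entry-script sketch for an online endpoint.
import json
import os

import joblib

model = None

def init():
    # AZUREML_MODEL_DIR points at the registered model files mounted into the container.
    # The relative file name depends on how the model was registered (assumed here).
    global model
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    # Expects a JSON payload such as {"data": [[0.1, 0.2, 0.3, 0.4]]}.
    data = json.loads(raw_data)["data"]
    predictions = model.predict(data)
    return predictions.tolist()
```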
Batch scoring
Batch scoring is supported through batch endpoints. For more information, see Endpoints.
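For illustration, here's a minimal sketch of creating a batch endpoint and deployment with the Python SDK azure-ai-ml v2. It assumes the MLClient from earlier sketches, a registered MLflow model (which doesn't require a scoring script), and a compute cluster named cpu-cluster; all of these names are placeholders.

```python
# Minimal sketch of a batch endpoint and deployment with the azure-ai-ml (v2) SDK.
from azure.ai.ml.entities import BatchEndpoint, BatchDeployment

batch_endpoint = BatchEndpoint(name="credit-batch", description="Batch scoring endpoint")
ml_client.batch_endpoints.begin_create_or_update(batch_endpoint).result()

batch_deployment = BatchDeployment(
    name="default",
    endpoint_name="credit-batch",
    model="azureml:credit-default-model@latest",  # assumed registered MLflow model
    compute="cpu-cluster",                        # assumed compute cluster
    instance_count=2,
    max_concurrency_per_instance=2,
    mini_batch_size=10,
)
ml_client.batch_deployments.begin_create_or_update(batch_deployment).result()
```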
Online endpoints
You can use online endpoints to deploy models for real-time inferencing. Online endpoints support controlled rollout (sketched after the following list), which lets you:
- Create multiple deployments for a single endpoint.
- Perform A/B testing by routing traffic to different deployments within the endpoint.
- Switch between endpoint deployments by updating the traffic percentage in endpoint configuration.
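As an illustration of controlled rollout, the following sketch splits traffic between two deployments (blue and green) under one online endpoint by using the Python SDK azure-ai-ml v2. The endpoint and deployment names are assumptions for this example.

```python
# Minimal sketch of A/B traffic splitting between two deployments of one online endpoint.
endpoint = ml_client.online_endpoints.get(name="credit-endpoint")

# Route 90% of traffic to the current deployment and 10% to the new one, then update.
endpoint.traffic = {"blue": 90, "green": 10}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Once the new deployment is validated, shift all traffic to it.
endpoint.traffic = {"blue": 0, "green": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```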
Analytics
Microsoft Power BI supports using machine learning models for data analytics. For more information, see Machine Learning integration in Power BI (preview).
Capture the governance data required for MLOps
Machine Learning gives you the capability to track the end-to-end audit trail of all your machine learning assets by using metadata. For example:
- Machine Learning datasets help you track, profile, and version data.
- Interpretability allows you to explain your models, meet regulatory compliance, and understand how models arrive at a result for specific input.
- Machine Learning Job history stores a snapshot of the code, data, and compute used to train a model.
- The Machine Learning Model Registry captures all the metadata associated with your model (see the sketch after this list). For example, metadata includes which experiment trained it, where it's being deployed, and whether its deployments are healthy.
- Integration with Azure Event Grid allows you to act on events in the machine learning lifecycle. Examples are model registration, deployment, data drift, and training (job) events.
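As an illustration of the lineage metadata that the registry captures, the following sketch lists the versions of a registered model and inspects the tags and properties of one version by using the Python SDK azure-ai-ml v2. The model name and version are placeholders, and the MLClient from earlier sketches is assumed.

```python
# Minimal sketch of querying model metadata from the registry with the azure-ai-ml (v2) SDK.

# List every registered version of a model and its tags.
for model in ml_client.models.list(name="credit-default-model"):
    print(model.name, model.version, model.tags)

# Fetch a specific version to inspect its full metadata.
model = ml_client.models.get(name="credit-default-model", version="3")
print(model.description, model.properties)
```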
Notify, automate, and alert on events in the machine learning lifecycle
Machine Learning publishes key events to Azure Event Grid, which can be used to notify and automate on events in the machine learning lifecycle. For more information, see Use Event Grid.
Automate the machine learning lifecycle
You can use GitHub and Azure Pipelines to create a continuous integration process that trains a model. In a typical scenario, when a data scientist checks a change into the Git repo for a project, Azure Pipelines starts a training job. The results of the job can then be inspected to see the performance characteristics of the trained model. You can also create a pipeline that deploys the model as a web service.
The Machine Learning extension makes it easier to work with Azure Pipelines. It provides the following enhancements to Azure Pipelines:
- Enables workspace selection when you define a service connection.
- Enables release pipelines to be triggered by trained models created in a training pipeline.
For more information on using Azure Pipelines with Machine Learning, see:
Next steps
Learn more by reading and exploring the following resources: