Tips Deploying Models to Production with MLflow

doanquang.trieu · Sep 29, 2023

**Deploying Models to Production with MLflow**

Machine learning (ML) models are becoming increasingly important for businesses of all sizes, but deploying them to production can be a challenge: models are often complex and need careful management to stay accurate and reliable.

MLflow is an open-source platform that helps you manage ML models and deploy them to production. Its main features include:

* Tracking: MLflow records the experiments that you run, including the parameters that you used, the metrics that you calculated, and any output files (artifacts). You can use this information to compare runs and improve your models over time.
* Model registry: MLflow provides a model registry where you can store your ML models and manage their lifecycle, including versioning and promotion toward deployment.
* Serving: MLflow can serve a logged model behind a REST API and package it for deployment to production platforms. How much traffic the model can handle then depends on the platform that hosts it.
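
To make the serving feature a little more concrete, here is a minimal sketch of the JSON body a client might send to a served model. It assumes MLflow 2.x or later, where a model served over REST (for example with `mlflow models serve`) accepts POST requests at `/invocations` with a `dataframe_split` payload. The helper name `build_invocation_payload`, the host and port in the comment, and the sample row are illustrative assumptions, not part of the original tutorial.

```python
import json

# Feature names of the California Housing dataset, in scikit-learn's order
FEATURE_NAMES = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def build_invocation_payload(rows):
    """Wrap feature rows in the `dataframe_split` layout (hypothetical helper)."""
    for row in rows:
        if len(row) != len(FEATURE_NAMES):
            raise ValueError("each row needs one value per feature")
    return json.dumps(
        {"dataframe_split": {"columns": FEATURE_NAMES, "data": rows}}
    )


# One made-up district; the values are for illustration only
payload = build_invocation_payload(
    [[8.3, 41.0, 6.9, 1.0, 322.0, 2.5, 37.88, -122.23]]
)
print(payload)

# Sending it to an assumed local endpoint would look roughly like this:
# import urllib.request
# request = urllib.request.Request(
#     "http://localhost:5001/invocations",
#     data=payload.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(request).read())
```

Building the payload separately from sending it keeps the example runnable without a live server; the commented request only works once a model is actually being served.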

In this article, we will show you how to deploy an ML model to production using MLflow. We will use a simple example of a model that predicts the price of a house.

## Prerequisites

To follow along with this tutorial, you will need the following:

* A Python environment with the following packages installed:
  * MLflow
  * scikit-learn
  * pandas
* A Jupyter notebook
* A cloud-based machine learning platform (such as Google Cloud Platform or Amazon Web Services)
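
Before going further, it can help to confirm that the Python packages are importable. The sketch below uses only the standard library. The `REQUIRED` list and the helper name `missing_packages` are assumptions for illustration, and `pandas` is included because the code in this tutorial calls `pd.DataFrame`.

```python
import importlib.util

# Import names of the packages used in this tutorial
REQUIRED = ["mlflow", "sklearn", "pandas"]


def missing_packages(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Note that scikit-learn is installed as `scikit-learn` but imported as `sklearn`, which is why the list uses the import name.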

## Getting Started

We will start by creating a simple ML model to predict the price of a house. We will use the [California Housing dataset](https://scikit-learn.org/stable/datasets/index.html#california-housing-dataset) from scikit-learn.

We can load the dataset into a pandas DataFrame using the following code:

```python
import pandas as pd
from sklearn.datasets import fetch_california_housing

# Downloads the dataset on first use and caches it locally
housing = fetch_california_housing()

# Build a DataFrame of features and append the target column
housing_df = pd.DataFrame(housing.data, columns=housing.feature_names)
housing_df['target'] = housing.target
```

We can now train a simple linear regression model to predict the price of a house. We can do this using the following code:

```python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(housing_df.drop('target', axis=1), housing_df['target'])
```

We can now evaluate the performance of our model using the following code:

```python
from sklearn.metrics import mean_squared_error

# Predict on the training features and compare against the true targets
y_pred = model.predict(housing_df.drop('target', axis=1))
mse = mean_squared_error(housing_df['target'], y_pred)  # (y_true, y_pred)

print(f'MSE: {mse:.3f}')
```

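On this dataset, the model's MSE comes out at roughly 0.52 (the target is expressed in units of $100,000). Note that this is a training error, measured on the same data the model was fit on, so it overstates how well the model will predict unseen houses.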
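
An error computed on the training data tends to be optimistic, so a held-out test set usually gives a more honest estimate. The following is a minimal sketch of that idea, not the tutorial's own method. To stay runnable offline it uses synthetic data from `make_regression` as a stand-in for the housing data, and the 80/20 split, the noise level, and the random seed are arbitrary assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (8 features, like California Housing)
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

# Hold out 20% of the rows; the ratio and seed are arbitrary choices
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit only on the training portion
holdout_model = LinearRegression().fit(X_train, y_train)

# Compare the error on seen and unseen rows
train_mse = mean_squared_error(y_train, holdout_model.predict(X_train))
test_mse = mean_squared_error(y_test, holdout_model.predict(X_test))

print(f"train MSE: {train_mse:.2f}")
print(f"test MSE:  {test_mse:.2f}")
```

With the real dataset, the same pattern applies by splitting `housing_df` into features and target before calling `train_test_split`.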

## Tracking Experiments

Now that we have trained a model, we can track it using MLflow. To do this, we can use the following code:

```python
import mlflow

mlflow.set_experiment('house-price-prediction')

# Group the logged values under one run that is closed automatically
with mlflow.start_run():
    mlflow.log_params({'model_type': 'linear_regression'})
    mlflow.log_metrics({'mse': mse})
```

This code will track the following information about our experiment:

* The experiment name
* The model type
* The MSE score

We can view the results of our experiment in the MLflow UI. If you are tracking locally, start the UI by running `mlflow ui` in the directory where the code was run, then open the following URL:

```
http://localhost:5000/
```

We can then select the experiment that we just created and view the results.

## Registering the Model

Now that we have tracked our experiment, we can register the model in the MLflow model registry.
