Machine Learning with Jupyter on Kubernetes: MLflow


  • MLflow tutorial: https://my.oschina.net/u/2306127/blog/1825690
  • Official MLflow documentation: https://www.mlflow.org/docs/latest/quickstart.html
  • Quick install: pip install mlflow
# Download the code
!git clone https://github.com/databricks/mlflow
Cloning into 'mlflow'...
remote: Counting objects: 830, done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 830 (delta 5), reused 11 (delta 4), pack-reused 804
Receiving objects: 100% (830/830), 3.04 MiB | 28.00 KiB/s, done.
Resolving deltas: 100% (339/339), done.
Checking out files: 100% (279/279), done.
%%!
export https_proxy=http://192.168.199.99:9999
echo $https_proxy
#pip install mlflow
['http://192.168.199.99:9999']
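Note that a `%%!` cell runs its commands in a separate shell process, so a variable exported there is not visible to later cells. A minimal sketch of one alternative, assuming the same proxy address as in the cell above: set the variable in the kernel's own environment through os.environ, which processes started afterwards (including `!` commands) inherit. The child command below is only an illustrative check.

```python
import os
import subprocess
import sys

# Proxy address taken from the cell above; adjust it to your own network.
PROXY_URL = "http://192.168.199.99:9999"

# Setting the variable on the kernel process makes it visible to every
# child process started afterwards.
os.environ["https_proxy"] = PROXY_URL

# Illustrative check: start a child process and read the variable back.
completed = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('https_proxy', ''))"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
child_value = completed.stdout.strip()
print(child_value)
```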
!pip install mlflow
Collecting mlflow
  Downloading https://files.pythonhosted.org/packages/65/a0/082dcecdd76845ee8e97472741a5315e6dc697e2552935a73bdb6196d515/mlflow-0.2.1.tar.gz (4.3MB)
    100% |████████████████████████████████| 4.3MB 153kB/s 
Collecting awscli (from mlflow)
  Downloading https://files.pythonhosted.org/packages/b5/dd/84d32d2275ea16daf09d561858dd0e615c56c9e8afb2e9b42d02bc45e417/awscli-1.15.51-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.3MB 129kB/s 
Collecting click>=6.7 (from mlflow)
  Downloading https://files.pythonhosted.org/packages/34/c1/8806f99713ddb993c5366c362b2f908f18269f8d792aff1abfd700775a77/click-6.7-py2.py3-none-any.whl (71kB)
    100% |████████████████████████████████| 71kB 102kB/s 
Collecting databricks-cli (from mlflow)
  Downloading https://files.pythonhosted.org/packages/58/78/4bda6f29a091ab7b0ad29efdba2491e5d0b56bd09d608857e6f0b799be48/databricks-cli-0.7.2.tar.gz
Requirement already satisfied: requests>=2.17.3 in /opt/conda/lib/python3.6/site-packages (from mlflow) (2.19.1)
Requirement already satisfied: six>=1.10.0 in /opt/conda/lib/python3.6/site-packages (from mlflow) (1.11.0)
Collecting uuid (from mlflow)
  Downloading https://files.pythonhosted.org/packages/ce/63/f42f5aa951ebf2c8dac81f77a8edcc1c218640a2a35a03b9ff2d4aa64c3d/uuid-1.30.tar.gz
Collecting gitpython (from mlflow)
  Downloading https://files.pythonhosted.org/packages/ac/c9/96d7c86c623cb065976e58c0f4898170507724d6b4be872891d763d686f4/GitPython-2.1.10-py2.py3-none-any.whl (449kB)
    100% |████████████████████████████████| 450kB 108kB/s 
Collecting gunicorn (from mlflow)
  Downloading https://files.pythonhosted.org/packages/8c/da/b8dd8deb741bff556db53902d4706774c8e1e67265f69528c14c003644e6/gunicorn-19.9.0-py2.py3-none-any.whl (112kB)
    100% |████████████████████████████████| 122kB 74kB/s 
Collecting Flask (from mlflow)
  Downloading https://files.pythonhosted.org/packages/7f/e7/08578774ed4536d3242b14dacb4696386634607af824ea997202cd0edb4b/Flask-1.0.2-py2.py3-none-any.whl (91kB)
    100% |████████████████████████████████| 92kB 47kB/s 
Requirement already satisfied: numpy in /opt/conda/lib/python3.6/site-packages (from mlflow) (1.13.3)
Requirement already satisfied: pandas in /opt/conda/lib/python3.6/site-packages (from mlflow) (0.23.1)
Requirement already satisfied: scipy in /opt/conda/lib/python3.6/site-packages (from mlflow) (1.1.0)
Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.6/site-packages (from mlflow) (0.19.1)
Requirement already satisfied: python-dateutil in /opt/conda/lib/python3.6/site-packages (from mlflow) (2.7.3)
Collecting protobuf>=3.6.0 (from mlflow)
  Downloading https://files.pythonhosted.org/packages/fc/f0/db040681187496d10ac50ad167a8fd5f953d115b16a7085e19193a6abfd2/protobuf-3.6.0-cp36-cp36m-manylinux1_x86_64.whl (7.1MB)
    100% |████████████████████████████████| 7.1MB 136kB/s 
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.6/site-packages (from mlflow) (3.12)
Collecting boto3 (from mlflow)
  Downloading https://files.pythonhosted.org/packages/59/f0/22554f0fc3aafd34e189919fd6a360d440fcaa6f86dedc9aaca904c885b1/boto3-1.7.50-py2.py3-none-any.whl (128kB)
    100% |████████████████████████████████| 133kB 211kB/s 
Collecting querystring_parser (from mlflow)
  Downloading https://files.pythonhosted.org/packages/57/64/3086a9a991ff3aca7b769f5b0b51ff8445a06337ae2c58f215bcee48f527/querystring_parser-1.2.3.tar.gz
Collecting docutils>=0.10 (from awscli->mlflow)
  Downloading https://files.pythonhosted.org/packages/36/fa/08e9e6e0e3cbd1d362c3bbee8d01d0aedb2155c4ac112b19ef3cae8eed8d/docutils-0.14-py3-none-any.whl (543kB)
    100% |████████████████████████████████| 552kB 213kB/s 
Collecting botocore==1.10.50 (from awscli->mlflow)
  Downloading https://files.pythonhosted.org/packages/d5/9f/2e701a365b5ff0e8b664d6c393f3c61c20e52bb5148bbc2e27d737b890db/botocore-1.10.50-py2.py3-none-any.whl (4.4MB)
    100% |████████████████████████████████| 4.4MB 221kB/s 
Requirement already satisfied: rsa<=3.5.0,>=3.1.2 in /opt/conda/lib/python3.6/site-packages (from awscli->mlflow) (3.4.2)
Collecting colorama<=0.3.9,>=0.2.5 (from awscli->mlflow)
  Downloading https://files.pythonhosted.org/packages/db/c8/7dcf9dbcb22429512708fe3a547f8b6101c0d02137acbd892505aee57adf/colorama-0.3.9-py2.py3-none-any.whl
Collecting s3transfer<0.2.0,>=0.1.12 (from awscli->mlflow)
  Downloading https://files.pythonhosted.org/packages/d7/14/2a0004d487464d120c9fb85313a75cd3d71a7506955be458eebfe19a6b1d/s3transfer-0.1.13-py2.py3-none-any.whl (59kB)
    100% |████████████████████████████████| 61kB 267kB/s 
Collecting tabulate>=0.7.7 (from databricks-cli->mlflow)
  Downloading https://files.pythonhosted.org/packages/12/c2/11d6845db5edf1295bc08b2f488cf5937806586afe42936c3f34c097ebdc/tabulate-0.8.2.tar.gz (45kB)
    100% |████████████████████████████████| 51kB 209kB/s 
Collecting configparser>=0.3.5 (from databricks-cli->mlflow)
  Downloading https://files.pythonhosted.org/packages/7c/69/c2ce7e91c89dc073eb1aa74c0621c3eefbffe8216b3f9af9d3885265c01c/configparser-3.5.0.tar.gz
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests>=2.17.3->mlflow) (3.0.4)
Requirement already satisfied: idna<2.8,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests>=2.17.3->mlflow) (2.7)
Requirement already satisfied: urllib3<1.24,>=1.21.1 in /opt/conda/lib/python3.6/site-packages (from requests>=2.17.3->mlflow) (1.23)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests>=2.17.3->mlflow) (2018.4.16)
Collecting gitdb2>=2.0.0 (from gitpython->mlflow)
  Downloading https://files.pythonhosted.org/packages/e0/95/c772c13b7c5740ec1a0924250e6defbf5dfdaee76a50d1c47f9c51f1cabb/gitdb2-2.0.3-py2.py3-none-any.whl (63kB)
    100% |████████████████████████████████| 71kB 701kB/s 
Collecting itsdangerous>=0.24 (from Flask->mlflow)
  Downloading https://files.pythonhosted.org/packages/dc/b4/a60bcdba945c00f6d608d8975131ab3f25b22f2bcfe1dab221165194b2d4/itsdangerous-0.24.tar.gz (46kB)
    100% |████████████████████████████████| 51kB 122kB/s 
Collecting Werkzeug>=0.14 (from Flask->mlflow)
  Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
    100% |████████████████████████████████| 327kB 366kB/s 
Requirement already satisfied: Jinja2>=2.10 in /opt/conda/lib/python3.6/site-packages (from Flask->mlflow) (2.10)
Requirement already satisfied: pytz>=2011k in /opt/conda/lib/python3.6/site-packages (from pandas->mlflow) (2018.4)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.6/site-packages (from protobuf>=3.6.0->mlflow) (39.2.0)
Collecting jmespath<1.0.0,>=0.7.1 (from boto3->mlflow)
  Downloading https://files.pythonhosted.org/packages/b7/31/05c8d001f7f87f0f07289a5fc0fc3832e9a57f2dbd4d3b0fee70e0d51365/jmespath-0.9.3-py2.py3-none-any.whl
Requirement already satisfied: pyasn1>=0.1.3 in /opt/conda/lib/python3.6/site-packages (from rsa<=3.5.0,>=3.1.2->awscli->mlflow) (0.4.3)
Collecting smmap2>=2.0.0 (from gitdb2>=2.0.0->gitpython->mlflow)
  Downloading https://files.pythonhosted.org/packages/e3/59/4e22f692e65f5f9271252a8e63f04ce4ad561d4e06192478ee48dfac9611/smmap2-2.0.3-py2.py3-none-any.whl
Requirement already satisfied: MarkupSafe>=0.23 in /opt/conda/lib/python3.6/site-packages (from Jinja2>=2.10->Flask->mlflow) (1.0)
Building wheels for collected packages: mlflow, databricks-cli, uuid, querystring-parser, tabulate, configparser, itsdangerous
  Running setup.py bdist_wheel for mlflow ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/fd/ef/05/d1a5e684ca724530d9e255a1052867461ed79ba163f7f8da03
  Running setup.py bdist_wheel for databricks-cli ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/ed/db/48/ec3b28dbc74ec2e2fe4d175efdcdddc64a37f855105fe650d5
  Running setup.py bdist_wheel for uuid ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/2a/80/9b/015026567c29fdffe31d91edbe7ba1b17728db79194fca1f21
  Running setup.py bdist_wheel for querystring-parser ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/ee/09/99/bf937e4f02788fa8b33dc5240842ba3977ba5c3c4ad4a115d7
  Running setup.py bdist_wheel for tabulate ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/2a/85/33/2f6da85d5f10614cbe5a625eab3b3aebfdf43e7b857f25f829
  Running setup.py bdist_wheel for configparser ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/a3/61/79/424ef897a2f3b14684a7de5d89e8600b460b89663e6ce9d17c
  Running setup.py bdist_wheel for itsdangerous ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/2c/4a/61/5599631c1554768c6290b08c02c72d7317910374ca602ff1e5
Successfully built mlflow databricks-cli uuid querystring-parser tabulate configparser itsdangerous
Installing collected packages: docutils, jmespath, botocore, colorama, s3transfer, awscli, click, tabulate, configparser, databricks-cli, uuid, smmap2, gitdb2, gitpython, gunicorn, itsdangerous, Werkzeug, Flask, protobuf, boto3, querystring-parser, mlflow
  Found existing installation: protobuf 3.5.2
    Uninstalling protobuf-3.5.2:
      Successfully uninstalled protobuf-3.5.2
Successfully installed Flask-1.0.2 Werkzeug-0.14.1 awscli-1.15.51 boto3-1.7.50 botocore-1.10.50 click-6.7 colorama-0.3.9 configparser-3.5.0 databricks-cli-0.7.2 docutils-0.14 gitdb2-2.0.3 gitpython-2.1.10 gunicorn-19.9.0 itsdangerous-0.24 jmespath-0.9.3 mlflow-0.2.1 protobuf-3.6.0 querystring-parser-1.2.3 s3transfer-0.1.13 smmap2-2.0.3 tabulate-0.8.2 uuid-1.30
!ls -l mlflow
total 100
-rw-r--r--  1 jovyan 4294967294  1460 Jul  4 03:44 CHANGELOG.rst
-rw-r--r--  1 jovyan 4294967294   305 Jul  4 03:44 conftest.py
-rw-r--r--  1 jovyan 4294967294  2586 Jul  4 03:44 CONTRIBUTING.rst
-rw-r--r--  1 jovyan 4294967294   126 Jul  4 03:44 dev-requirements.txt
-rw-r--r--  1 jovyan 4294967294   372 Jul  4 03:44 Dockerfile
drwxr-sr-x  4 jovyan 4294967294  4096 Jul  4 03:44 docs
drwxr-sr-x  5 jovyan 4294967294  4096 Jul  4 03:44 example
-rwxr-xr-x  1 jovyan 4294967294   882 Jul  4 03:44 generate-protos.sh
-rw-r--r--  1 jovyan 4294967294   815 Jul  4 03:44 ISSUE_TEMPLATE.md
-rw-r--r--  1 jovyan 4294967294 11382 Jul  4 03:44 LICENSE.txt
-rwxr-xr-x  1 jovyan 4294967294   138 Jul  4 03:44 lint.sh
drwxr-sr-x 12 jovyan 4294967294  4096 Jul  4 03:44 mlflow
-rw-r--r--  1 jovyan 4294967294 16956 Jul  4 03:44 pylintrc
-rw-r--r--  1 jovyan 4294967294  2257 Jul  4 03:44 README.rst
-rw-r--r--  1 jovyan 4294967294  1828 Jul  4 03:44 setup.py
-rwxr-xr-x  1 jovyan 4294967294   330 Jul  4 03:44 test-generate-protos.sh
drwxr-sr-x 13 jovyan 4294967294  4096 Jul  4 03:44 tests
-rw-r--r--  1 jovyan 4294967294   281 Jul  4 03:44 tox.ini
-rw-r--r--  1 jovyan 4294967294   147 Jul  4 03:44 tox-requirements.txt
# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn


def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2
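
To make the three scores concrete, here is a minimal pure-Python sketch of the same quantities that eval_metrics obtains from scikit-learn. The function name eval_metrics_plain and the sample values are made up for illustration; the formulas are the standard definitions of RMSE, MAE, and R².

```python
import math


def eval_metrics_plain(actual, pred):
    """Compute RMSE, MAE and R^2 for two equal-length sequences of numbers."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, pred)]

    # Root mean squared error: square root of the mean squared residual.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute error: mean of the absolute residuals.
    mae = sum(abs(e) for e in errors) / n

    # R^2: 1 minus residual sum of squares over total sum of squares.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Made-up quality scores and predictions, for illustration only.
rmse, mae, r2 = eval_metrics_plain([5, 6, 7, 5], [5.5, 6.0, 6.5, 5.0])
print(rmse, mae, r2)
```

Lower RMSE and MAE mean smaller errors, while R² approaches 1 as the predictions explain more of the variance in the actual values.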

Prepare the data

warnings.filterwarnings("ignore")
np.random.seed(40)

# Read the wine-quality csv file (the path is relative to the directory where the mlflow repository was cloned)
#wine_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "./mlflow/example/wine-quality.csv")
wine_path = "./mlflow/example/tutorial/wine-quality.csv"
data = pd.read_csv(wine_path)

# Split the data into training and test sets. (0.75, 0.25) split.
train, test = train_test_split(data)

# The predicted column is "quality" which is a scalar from [3, 9]
train_x = train.drop(["quality"], axis=1)
test_x = test.drop(["quality"], axis=1)
train_y = train[["quality"]]
test_y = test[["quality"]]

# Inside a notebook kernel, sys.argv typically holds the kernel's launch arguments
# rather than hyperparameters, so the values are set directly here.
alpha = 0.5
l1_ratio = 0.5
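
The values of alpha and l1_ratio can come from the command line when this code runs as a script. Inside a Jupyter kernel, however, sys.argv typically carries the kernel's own launch arguments (such as -f and a connection file path), which float() cannot parse. A minimal sketch of a helper that tolerates both situations follows; the name parse_hyperparameter and the sample argument lists are hypothetical.

```python
def parse_hyperparameter(argv, index, default):
    """Return argv[index] as a float, or default if it is missing or not numeric."""
    try:
        return float(argv[index])
    except (IndexError, ValueError):
        # IndexError: the argument was not supplied.
        # ValueError: the argument is not a number (e.g. the kernel's "-f" flag).
        return default


# Script-style arguments: both hyperparameters are supplied.
script_argv = ["train.py", "0.3", "0.7"]
script_alpha = parse_hyperparameter(script_argv, 1, 0.5)
script_l1_ratio = parse_hyperparameter(script_argv, 2, 0.5)

# Kernel-style arguments (illustrative): neither value is numeric.
kernel_argv = ["ipykernel_launcher.py", "-f", "kernel-1234.json"]
kernel_alpha = parse_hyperparameter(kernel_argv, 1, 0.5)
kernel_l1_ratio = parse_hyperparameter(kernel_argv, 2, 0.5)

print(script_alpha, script_l1_ratio, kernel_alpha, kernel_l1_ratio)
```

In the notebook this would be called as parse_hyperparameter(sys.argv, 1, 0.5), falling back to the default of 0.5 whenever no usable value is present.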

Fit the model, predict on the test data, evaluate the accuracy, and log the parameters and metrics.

with mlflow.start_run():
    lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    lr.fit(train_x, train_y)

    predicted_qualities = lr.predict(test_x)

    (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

    print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
    print("  RMSE: %s" % rmse)
    print("  MAE: %s" % mae)
    print("  R2: %s" % r2)

    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    mlflow.sklearn.log_model(lr, "model")
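
The run above follows a simple pattern: open a run, log named parameters and metrics inside it, and have the results persisted when the block ends. The sketch below imitates that pattern with the standard library only, to show the idea. ToyRun and toy_start_run are hypothetical names, the JSON layout is invented, and none of this is MLflow's actual implementation or storage format.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager


class ToyRun:
    """Hypothetical stand-in for a tracking run; not part of MLflow."""

    def __init__(self):
        self.params = {}
        self.metrics = {}

    def log_param(self, key, value):
        # In this toy, a parameter is an input recorded once, as a string.
        self.params[key] = str(value)

    def log_metric(self, key, value):
        # In this toy, every metric value is kept together with a timestamp.
        self.metrics.setdefault(key, []).append([time.time(), float(value)])


@contextmanager
def toy_start_run(output_dir):
    run = ToyRun()
    try:
        yield run
    finally:
        # Persist whatever was logged, even if the body raised an exception.
        with open(os.path.join(output_dir, "run.json"), "w") as fh:
            json.dump({"params": run.params, "metrics": run.metrics}, fh)


output_dir = tempfile.mkdtemp()
with toy_start_run(output_dir) as run:
    run.log_param("alpha", 0.5)
    run.log_metric("rmse", 0.82)  # made-up value, for illustration only

# Read the record back to confirm what was stored.
with open(os.path.join(output_dir, "run.json")) as fh:
    saved = json.load(fh)
print(saved["params"])
```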

The training code above still has some problems and awaits further improvements in MLflow.

Reposted from: https://my.oschina.net/u/2306127/blog/1839924
