Installing Kubeflow 0.2.2: Jupyter and the Core TensorFlow Components

Install ksonnet

 curl -o ks_0.9.2_linux_amd64.tar.gz http://kubeflow.oss-cn-beijing.aliyuncs.com/ks_0.9.2_linux_amd64.tar.gz
 tar -xvf ks_0.9.2_linux_amd64.tar.gz
 cp ks_0.9.2_linux_amd64/ks /usr/local/bin/
 ks version
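
Before moving on, it can help to confirm that the binary really is on the PATH. The following is a minimal sketch; `require_cmd` is a hypothetical helper name introduced here, not part of ksonnet.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1" >&2
  return 1
}
# Usage: require_cmd ks && ks version
```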

Prepare a GitHub token

Log in to https://github.com/settings/tokens and create a token. The token does not need any permissions.

# Set the token first, so that the line written to ~/.bashrc is not empty
export GITHUB_TOKEN="your GitHub token"
echo "export GITHUB_TOKEN=${GITHUB_TOKEN}" >> ~/.bashrc
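
Since the later `ks registry add` and `ks pkg install` steps fetch from GitHub, a quick check that the variable is actually set in the current shell can save a failed run. A minimal sketch, assuming a hypothetical helper named `check_github_token`:

```shell
# Hypothetical helper: fail early if GITHUB_TOKEN is unset or empty.
check_github_token() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN is not set in this shell" >&2
    return 1
  fi
  return 0
}
# Usage: check_github_token && ks registry list
```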

Install Kubeflow. This step first installs the core components that support TensorFlow.

  # Create the namespace that Kubeflow is deployed into
  NAMESPACE=kubeflow
  kubectl create namespace ${NAMESPACE}
  VERSION=jupyterhub-alibaba-cloud
  APP_NAME=my-kubeflow
  # Initialize the ksonnet application and bind its default environment to the namespace
  ks init ${APP_NAME} --api-spec=version:v1.9.3
  cd ${APP_NAME}
  ks env set default --namespace ${NAMESPACE}
  # Register the Kubeflow registry from the ${VERSION} branch of github.com/cheyang/kubeflow
  ks registry add kubeflow github.com/cheyang/kubeflow/tree/${VERSION}/kubeflow
  ks registry list
  # Install each package once
  ks pkg install kubeflow/core@${VERSION}
  ks pkg install kubeflow/tf-serving@${VERSION}
  ks pkg install kubeflow/tf-job@${VERSION}
  # Generate the core component and point its images at the Aliyun registries
  ks generate kubeflow-core kubeflow-core
  ks param set kubeflow-core cloud ack
  ks param set kubeflow-core jupyterHubImage registry.aliyuncs.com/kubeflow-images-public/jupyterhub-k8s:1.0.1
  ks param set kubeflow-core tfJobImage registry.cn-hangzhou.aliyuncs.com/kubeflow-images-public/tf_operator:v20180326-6214e560
  ks param set kubeflow-core tfAmbassadorImage registry.aliyuncs.com/datawire/ambassador:0.34.0
  ks param set kubeflow-core tfStatsdImage registry.aliyuncs.com/datawire/statsd:0.34.0
  ks param set kubeflow-core jupyterNotebookRegistry registry.aliyuncs.com
  ks param set kubeflow-core JupyterNotebookRepoName kubeflow-images-public
  # Service types for JupyterHub, Ambassador and the TFJob UI
  ks param set kubeflow-core jupyterHubServiceType LoadBalancer
  ks param set kubeflow-core tfAmbassadorServiceType LoadBalancer
  ks param set kubeflow-core tfJobUiServiceType LoadBalancer

  # Deploy the component to the default environment
  ks apply default -c kubeflow-core

After the commands finish, check the pods in the cluster

[root@master pipelines]# kubectl -n kubeflow get po
NAME                                              READY   STATUS             RESTARTS   AGE
ambassador-cd476cb56-jk79h                        2/2     Running            0          27h
ambassador-cd476cb56-md5x5                        2/2     Running            0          27h
ambassador-cd476cb56-qr5l4                        2/2     Running            0          27h
centraldashboard-7d45f8cbc8-vdksx                 1/1     Running            0          27h
tf-hub-0                                          1/1     Running            0          27h
tf-job-dashboard-9fd7d588-z9nc8                   1/1     Running            0          27h
tf-job-operator-8d98cd89b-vbv97                   1/1     Running            0          27h
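
Reading the table by eye works for a handful of pods. As a minimal sketch, the listing can also be filtered for pods that are not ready yet. `not_ready_pods` is a hypothetical helper name, and the column layout is assumed to match the output shown above.

```shell
# Read `kubectl get po` table output on stdin and print the names of pods
# whose STATUS is not Running or whose READY count is incomplete (e.g. 0/1).
not_ready_pods() {
  awk 'NR > 1 {
    split($2, r, "/")
    if ($3 != "Running" || r[1] != r[2]) print $1
  }'
}
# Usage: kubectl -n kubeflow get po | not_ready_pods
```

For the listing above, the helper prints nothing, because every pod is Running with all containers ready.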

Expose Jupyter for external access; a LoadBalancer is used here

kubectl -n kubeflow edit svc tf-hub-lb

Change type to LoadBalancer:
spec:
  clusterIP: 10.104.28.245
  externalTrafficPolicy: Cluster
  ports:
  - name: hub
    nodePort: 32357
    port: 80
    protocol: TCP
    targetPort: 8000
  selector:
    app: tf-hub
  sessionAffinity: None
  type: LoadBalancer
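
The same change can be made without opening an editor. The following is a minimal sketch using `kubectl patch`; the function names `service_type_patch` and `set_service_type` are hypothetical helpers introduced here for illustration.

```shell
# Build the JSON patch body that sets a Service's type.
service_type_patch() {
  printf '{"spec":{"type":"%s"}}' "$1"
}

# Apply the patch non-interactively.
# Arguments: namespace, service name, service type.
set_service_type() {
  kubectl -n "$1" patch svc "$2" -p "$(service_type_patch "$3")"
}
# Usage: set_service_type kubeflow tf-hub-lb LoadBalancer
```

Keeping the patch body in its own function makes it easy to inspect what will be sent before touching the cluster.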

Get the external address of Jupyter

[root@master pipelines]# kubectl get service -n kubeflow |grep tf-hub-lb
tf-hub-lb                         LoadBalancer   10.104.28.245    10.18.5.30    80:32357/TCP        28h
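
If the address is needed in a script, it can be pulled out of the same table instead of using `grep`. A minimal sketch, assuming the column layout shown above; `external_ip` is a hypothetical helper name.

```shell
# Read `kubectl get service` table output on stdin and print the
# EXTERNAL-IP column (4th field) of the named service.
external_ip() {
  awk -v svc="$1" '$1 == svc { print $4 }'
}
# Usage: kubectl get service -n kubeflow | external_ip tf-hub-lb
```

With the listing above, this would print 10.18.5.30.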

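In this test the address is 10.18.5.30. Open http://10.18.5.30 in a browser to reach Jupyter.
[Figure 1]

Log in with any username and password; this article uses the username mocktest. After logging in, click [Start My Server], fill in the server configuration, and select an image.
[Figure 2]
After confirming, wait for the server to be created; the page then redirects to Jupyter.
[Figure 3]
At this point, listing the pods in the Kubernetes cluster shows the corresponding server: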

[root@master pipelines]# kubectl get po -n kubeflow
NAME                                              READY   STATUS             RESTARTS   AGE
jupyter-mock                                      1/1     Running            0          97s

Run a piece of Python code in the notebook; it executes successfully.

Train TensorFlow with Kubeflow

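Expose the ambassador service the same way as tf-hub-lb. In this article its address is 10.18.5.32. Open http://10.18.5.32 in a browser to reach tf-dashboard, choose Create, and fill in the required fields. Here I filled in only the master image.
[Figure 4]
Check the pods in the Kubernetes cluster: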

[root@master pipelines]# kubectl get pod
NAME                                   READY   STATUS    RESTARTS   AGE
2020-02-12-master-0q6y-0-cy42r         0/1     Running   0          5m16s

The pod for the master has been created and the page shows the job as running, which indicates that the installed Kubeflow supports TensorFlow.

[Figure 5]

Reference: https://blog.csdn.net/weixin_33849942/article/details/89699917
