Kubeflow 1.3 Installation Notes (with Internet Access)

Installing Kubeflow is straightforward when the machines can reach the public internet, but much harder when they cannot. This post covers the internet-connected case first; a follow-up will attempt the installation inside a restricted LAN (a corporate network). The machines are Tencent Cloud spot instances: two CentOS 7.6 servers in the Hong Kong region, each with 8 cores, 16 GB of RAM, a 300 GB root disk, and 1 Mbps pay-by-bandwidth networking, for a combined ¥1.23 per hour. The security group allows all ports, and login is by password.

Installing Docker

First, install Docker on both machines:

yum -y install yum-utils && \
yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo && \
yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.4.3-3.1.el7.x86_64.rpm && \
yum install docker-ce -y
systemctl --now enable docker

yum install -y yum-utils device-mapper-persistent-data lvm2 git

If yum cannot reach these repositories due to network issues, download a tar.gz from https://download.docker.com/linux/static/stable/x86_64/ instead, extract it, copy everything under docker/ to /usr/bin, and create the unit file /etc/systemd/system/docker.service with the following content:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd --selinux-enabled=false
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

Run systemctl daemon-reload && systemctl restart docker to finish the Docker installation.

Note: by default Docker stores image data under /var. If that filesystem fills up, images may get garbage-collected (the kubelet prunes images under disk pressure), so you can pass --graph=/root/images at startup to relocate the image store.
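The same relocation can also be done in /etc/docker/daemon.json; a minimal sketch, assuming a Docker release where the key is named data-root (the path is just an example), then restart Docker:

{
  "data-root": "/root/images"
}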

Installing Harbor

Kubeflow needs a private image registry to hold the images generated while it runs. Here we deploy Harbor (a plain docker registry would also work). This only needs to be done on one machine.

Install docker-compose:

wget https://github.com/docker/compose/releases/download/1.29.2/docker-compose-Linux-x86_64
mv docker-compose-Linux-x86_64 docker-compose
chmod +x docker-compose
mv docker-compose /usr/bin/

Run docker-compose version to verify the installation, then download and install Harbor:

wget https://github.com/goharbor/harbor/releases/download/v1.9.2/harbor-offline-installer-v1.9.2.tgz
tar -zxvf harbor-offline-installer-v1.9.2.tgz
cd harbor
vim harbor.yml
./install.sh

Before running install.sh, edit the hostname and port in harbor.yml (since Harbor 1.8 the config file is named harbor.yml).
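A sketch of the two fields changed here, using this setup's values (internal IP 172.19.0.14, port 86):

hostname: 172.19.0.14
http:
  port: 86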


Now http://43.128.14.116:86/ serves the Harbor UI (43.128.14.116 is the server's public IP; 172.19.0.14 is its internal IP). The default account is admin with password Harbor12345. After logging in, create a public project named kubeflow to act as our image repository.

Add the following to the Docker config file /etc/docker/daemon.json, restart Docker, then log in to the registry:

{ "insecure-registries": ["172.19.0.14:86"] }

docker login -u admin -p Harbor12345 172.19.0.14:86
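As an optional sanity check, tag and push a small image into the new kubeflow project (the hello-world image here is just an example):

docker pull hello-world
docker tag hello-world 172.19.0.14:86/kubeflow/hello-world:latest
docker push 172.19.0.14:86/kubeflow/hello-world:latest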


Harbor is now up. If it ever needs a restart, run docker-compose up -d from the harbor/ directory.

Installing Kubernetes

We install Kubernetes with kubeadm. Kubeflow 1.3 requires at least Kubernetes 1.15 and recommends 1.17+; version 1.19 is used here. Run the following on both machines:

sudo setenforce 0         # disable SELinux
sudo swapoff -a           # disable swap

#enable ipvs
modprobe br_netfilter
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

chmod  755  /etc/sysconfig/modules/ipvs.modules &&  bash  /etc/sysconfig/modules/ipvs.modules &&  lsmod | grep -e ip_vs

#add the Kubernetes yum repo, /etc/yum.repos.d/kubernetes.repo (all nodes)
cat > /etc/yum.repos.d/kubernetes.repo  << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

yum makecache fast -y 

#let bridged traffic pass through iptables and enable IP forwarding
echo 1 >/proc/sys/net/bridge/bridge-nf-call-iptables    
echo 1 > /proc/sys/net/ipv4/ip_forward

#append to /etc/profile:
export KUBECONFIG=/etc/kubernetes/admin.conf
export GODEBUG=x509ignoreCN=0

source /etc/profile

yum install -y kubelet-1.19.1  kubeadm-1.19.1  kubectl-1.19.1

systemctl enable kubelet && systemctl restart kubelet

Pick one machine as the Kubernetes master and initialize it:

kubeadm config print init-defaults > kubeadm-config.yaml

Edit kubeadm-config.yaml: set advertiseAddress to this machine's internal IP, and add the following under the networking section:

podSubnet: 10.244.0.0/16

kubeadm init --config=kubeadm-config.yaml --upload-certs |tee kubeadm-init.log

On the other machine, run the kubeadm join command to join the cluster, then configure the cluster as follows.
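The exact join command, with a real token and CA cert hash, is printed at the end of kubeadm-init.log; the values below are placeholders:

kubeadm join <master-internal-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>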

#install the flannel network plugin
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml

#restart the kube-proxy pods
kubectl get pod -n kube-system | grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'

#allow the master to also schedule workloads (vm-0-14-centos is the master's node name)
kubectl taint node vm-0-14-centos node-role.kubernetes.io/master-

Next, edit the static pod manifests under /etc/kubernetes/manifests/.

In kube-apiserver.yaml, add the following flags (these enable the service account token projection that Kubeflow 1.3 relies on):

- --service-account-signing-key-file=/etc/kubernetes/pki/sa.key

- --service-account-issuer=kubernetes.default.svc

In kube-controller-manager.yaml and kube-scheduler.yaml, comment out the - --port=0 line.

Restart the kubelet: systemctl restart kubelet


Setting up NFS and creating PVs

Before installing Kubeflow, set up NFS and create PVs: components such as MySQL and Katib request PVCs during installation, and without pre-created PVs those pods stay stuck in Pending.

Pick one machine as the NFS server and install the packages: yum install -y nfs-utils rpcbind

#create the shared directories
mkdir -p /root/nfs-kubeflow/v{1..5}
#configure the NFS export
vim /etc/exports
#add the line: /root/nfs-kubeflow *(insecure,rw,no_root_squash,no_all_squash,sync)
#reload the export table
exportfs  -r 
#start the rpcbind and nfs services
service rpcbind  start  
service nfs  start  
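To confirm the export is visible, query it with showmount (part of nfs-utils); 172.19.0.14 is the NFS server's internal IP here:

showmount -e 172.19.0.14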


Create pv.yaml with the following content, four PVs backed by the v1 through v4 directories (v5 is reserved for the StorageClass provisioner set up later). Replace path and server with your own values:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv001
  labels:
    name: pv001
spec:
  nfs:
    path: /root/nfs-kubeflow/v1
    server: 172.19.0.14
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 15Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv002
  labels:
    name: pv002
spec:
  nfs:
    path: /root/nfs-kubeflow/v2
    server: 172.19.0.14
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv003
  labels:
    name: pv003
spec:
  nfs:
    path: /root/nfs-kubeflow/v3
    server: 172.19.0.14
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv004
  labels:
    name: pv004
spec:
  nfs:
    path: /root/nfs-kubeflow/v4
    server: 172.19.0.14
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  capacity:
    storage: 25Gi

Run kubectl apply -f pv.yaml to create the PVs, then check them with kubectl get pv.


Installing Kubeflow

wget https://github.com/kubeflow/manifests/archive/refs/tags/v1.3.0.zip
unzip v1.3.0.zip
wget https://github.com/kubernetes-sigs/kustomize/releases/download/v3.2.0/kustomize_3.2.0_linux_amd64
mv kustomize_3.2.0_linux_amd64 kustomize
chmod +x kustomize
mv kustomize /usr/bin/
cd manifests-1.3.0/

#install istio
kustomize build common/istio-1-9-0/istio-crds/base | kubectl apply -f -
kustomize build common/istio-1-9-0/istio-namespace/base | kubectl apply -f -
kustomize build common/istio-1-9-0/istio-install/base | kubectl apply -f -

#install cert-manager
kustomize build common/cert-manager/cert-manager-kube-system-resources/base | kubectl apply -f -
kustomize build common/cert-manager/cert-manager-crds/base | kubectl apply -f -
kustomize build common/cert-manager/cert-manager/overlays/self-signed | kubectl apply -f -

#install dex
kustomize build common/dex/overlays/istio | kubectl apply -f -

#install the oidc authservice
kustomize build common/oidc-authservice/base | kubectl apply -f -

#install knative-serving
kustomize build common/knative/knative-serving-crds/base | kubectl apply -f -
kustomize build common/knative/knative-serving-install/base | kubectl apply -f -
kustomize build common/istio-1-9-0/cluster-local-gateway/base | kubectl apply -f -

#install knative-eventing
kustomize build common/knative/knative-eventing-crds/base | kubectl apply -f -
kustomize build common/knative/knative-eventing-install/base | kubectl apply -f -

#create the kubeflow namespace
kustomize build common/kubeflow-namespace/base | kubectl apply -f -

#create kubeflow-roles
kustomize build common/kubeflow-roles/base | kubectl apply -f -

#create the kubeflow istio resources
kustomize build common/istio-1-9-0/kubeflow-istio-resources/base | kubectl apply -f -

#install pipelines (multi-user)
kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -

#install kfserving
kustomize build apps/kfserving/upstream/overlays/kubeflow | kubectl apply -f -

#install katib
kustomize build apps/katib/upstream/installs/katib-with-kubeflow | kubectl apply -f -

#install the kubeflow central dashboard
kustomize build apps/centraldashboard/upstream/overlays/istio | kubectl apply -f -

#install the admission webhook
kustomize build apps/admission-webhook/upstream/overlays/cert-manager | kubectl apply -f -

#install the notebook controller
kustomize build apps/jupyter/notebook-controller/upstream/overlays/kubeflow | kubectl apply -f -

#install the jupyter web app
kustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio | kubectl apply -f -

#install kfam (profiles)
kustomize build apps/profiles/upstream/overlays/kubeflow | kubectl apply -f -

#install the volumes web app
kustomize build apps/volumes-web-app/upstream/overlays/istio | kubectl apply -f -

#install tensorboard
kustomize build apps/tensorboard/tensorboards-web-app/upstream/overlays/istio | kubectl apply -f -
kustomize build apps/tensorboard/tensorboard-controller/upstream/overlays/kubeflow | kubectl apply -f -

#install the training operators
kustomize build apps/tf-training/upstream/overlays/kubeflow | kubectl apply -f -
kustomize build apps/pytorch-job/upstream/overlays/kubeflow | kubectl apply -f -
kustomize build apps/mpi-job/upstream/overlays/kubeflow | kubectl apply -f -
kustomize build apps/mxnet-job/upstream/overlays/kubeflow | kubectl apply -f -
kustomize build apps/xgboost-job/upstream/overlays/kubeflow | kubectl apply -f -

#create the default user namespace
kustomize build common/user-namespace/base | kubectl apply -f -

Wait until every pod is in the Running state.
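One way to watch progress across all namespaces (Ctrl-C once everything settles):

kubectl get pods -A -w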


Expose the dashboard on NodePort 30000 by creating kubeflow-ui-nodeport.yaml:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: istio-ingressgateway
    install.operator.istio.io/owning-resource: unknown
    istio: ingressgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway
  namespace: istio-system
spec:
  ports:
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http2
    port: 80
    protocol: TCP
    targetPort: 8080
    nodePort: 30000
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  - name: tcp
    port: 31400
    protocol: TCP
    targetPort: 31400
  - name: tls
    port: 15443
    protocol: TCP
    targetPort: 15443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  type: NodePort

Run kubectl apply -f kubeflow-ui-nodeport.yaml

That completes the Kubeflow 1.3 deployment; it can be reached at http://ip:30000 with the default account [email protected] and password 12341234.

Setting a default StorageClass

By default, creating a notebook server fails because no default StorageClass is configured, so its workspace PVC cannot be provisioned.

Create storage-nfs.yaml, replacing the NFS_SERVER and NFS_PATH values (and the nfs volume's server and path) with your own. Note the Deployment uses apps/v1 here, since the extensions/v1beta1 API was removed in Kubernetes 1.16:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-provisioner
  template:
    metadata:
      labels:
        app: nfs-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-provisioner
          image: registry.cn-hangzhou.aliyuncs.com/open-ali/nfs-client-provisioner
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: kubeflow/nfs
            - name: NFS_SERVER
              value: 172.19.0.14
            - name: NFS_PATH
              value: /root/nfs-kubeflow/v5
      volumes:
        - name: nfs-client-root
          nfs:
            server: 172.19.0.14
            path: /root/nfs-kubeflow/v5

Create storage-rbac.yaml. Make sure the namespace in the ClusterRoleBinding's subjects matches the namespace where the ServiceAccount from storage-nfs.yaml was actually created:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: kubeflow-user-example-com
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
verbs: ["list", "watch", "create", "update", "patch"]

Create storage-class.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kubeflow-nfs-storage
provisioner: kubeflow/nfs

Run the following commands in order:

kubectl apply -f storage-nfs.yaml
kubectl apply -f storage-rbac.yaml
kubectl apply -f storage-class.yaml
kubectl patch storageclass kubeflow-nfs-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
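Verify that the new class is marked as the default:

kubectl get storageclass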

After refreshing the page, the error no longer appears.

Enabling HTTPS access

If the service is exposed beyond localhost via NodePort, LoadBalancer, or Ingress, HTTPS is required; over plain HTTP the main page loads but notebooks, SSH connections, and the like fail. Create a script create_self-signed-cert.sh to generate a self-signed SSL certificate:

#!/bin/bash -e

help ()
{
    echo  ' ================================================================ '
    echo  ' --ssl-domain: primary domain for the certificate; defaults to www.rancher.local and can be ignored for IP-only access;'
    echo  ' --ssl-trusted-ip: certificates normally trust only domain names; to access the server by IP, list additional IPs here, comma separated;'
    echo  ' --ssl-trusted-domain: additional domains to trust (SSL_TRUSTED_DOMAIN), comma separated;'
    echo  ' --ssl-size: key size in bits, default 2048;'
    echo  ' --ssl-cn: two-letter country code, default CN;'
    echo  ' usage example:'
    echo  ' ./create_self-signed-cert.sh --ssl-domain=www.test.com --ssl-trusted-domain=www.test2.com \ '
    echo  ' --ssl-trusted-ip=1.1.1.1,2.2.2.2,3.3.3.3 --ssl-size=2048 --ssl-date=3650'
    echo  ' ================================================================'
}

case "$1" in
    -h|--help) help; exit;;
esac

if [[ $1 == '' ]];then
    help;
    exit;
fi

CMDOPTS="$*"
for OPTS in $CMDOPTS;
do
    key=$(echo ${OPTS} | awk -F"=" '{print $1}' )
    value=$(echo ${OPTS} | awk -F"=" '{print $2}' )
    case "$key" in
        --ssl-domain) SSL_DOMAIN=$value ;;
        --ssl-trusted-ip) SSL_TRUSTED_IP=$value ;;
        --ssl-trusted-domain) SSL_TRUSTED_DOMAIN=$value ;;
        --ssl-size) SSL_SIZE=$value ;;
        --ssl-date) SSL_DATE=$value ;;
        --ca-date) CA_DATE=$value ;;
        --ssl-cn) CN=$value ;;
    esac
done

# CA settings
CA_DATE=${CA_DATE:-3650}
CA_KEY=${CA_KEY:-cakey.pem}
CA_CERT=${CA_CERT:-cacerts.pem}
CA_DOMAIN=cattle-ca

# SSL settings
SSL_CONFIG=${SSL_CONFIG:-$PWD/openssl.cnf}
SSL_DOMAIN=${SSL_DOMAIN:-'www.rancher.local'}
SSL_DATE=${SSL_DATE:-3650}
SSL_SIZE=${SSL_SIZE:-2048}

## two-letter country code, default CN
CN=${CN:-CN}

SSL_KEY=$SSL_DOMAIN.key
SSL_CSR=$SSL_DOMAIN.csr
SSL_CERT=$SSL_DOMAIN.crt

echo -e "\033[32m ---------------------------- \033[0m"
echo -e "\033[32m       | 生成 SSL Cert |       \033[0m"
echo -e "\033[32m ---------------------------- \033[0m"

if [[ -e ./${CA_KEY} ]]; then
    echo -e "\033[32m ====> 1. Existing CA key found; backing up ${CA_KEY} to ${CA_KEY}-bak and regenerating \033[0m"
    mv ${CA_KEY} "${CA_KEY}"-bak
    openssl genrsa -out ${CA_KEY} ${SSL_SIZE}
else
    echo -e "\033[32m ====> 1. Generating new CA key ${CA_KEY} \033[0m"
    openssl genrsa -out ${CA_KEY} ${SSL_SIZE}
fi

if [[ -e ./${CA_CERT} ]]; then
    echo -e "\033[32m ====> 2. Existing CA cert found; backing up ${CA_CERT} to ${CA_CERT}-bak and regenerating \033[0m"
    mv ${CA_CERT} "${CA_CERT}"-bak
    openssl req -x509 -sha256 -new -nodes -key ${CA_KEY} -days ${CA_DATE} -out ${CA_CERT} -subj "/C=${CN}/CN=${CA_DOMAIN}"
else
    echo -e "\033[32m ====> 2. Generating new CA cert ${CA_CERT} \033[0m"
    openssl req -x509 -sha256 -new -nodes -key ${CA_KEY} -days ${CA_DATE} -out ${CA_CERT} -subj "/C=${CN}/CN=${CA_DOMAIN}"
fi

echo -e "\033[32m ====> 3. 生成Openssl配置文件 ${SSL_CONFIG} \033[0m"
cat > ${SSL_CONFIG} <> ${SSL_CONFIG} <> ${SSL_CONFIG}
    done

    if [[ -n ${SSL_TRUSTED_IP} ]]; then
        ip=(${SSL_TRUSTED_IP})
        for i in "${!ip[@]}"; do
          echo IP.$((i+1)) = ${ip[$i]} >> ${SSL_CONFIG}
        done
    fi
fi

echo -e "\033[32m ====> 4. 生成服务SSL KEY ${SSL_KEY} \033[0m"
openssl genrsa -out ${SSL_KEY} ${SSL_SIZE}

echo -e "\033[32m ====> 5. 生成服务SSL CSR ${SSL_CSR} \033[0m"
openssl req -sha256 -new -key ${SSL_KEY} -out ${SSL_CSR} -subj "/C=${CN}/CN=${SSL_DOMAIN}" -config ${SSL_CONFIG}

echo -e "\033[32m ====> 6. 生成服务SSL CERT ${SSL_CERT} \033[0m"
openssl x509 -sha256 -req -in ${SSL_CSR} -CA ${CA_CERT} \
    -CAkey ${CA_KEY} -CAcreateserial -out ${SSL_CERT} \
    -days ${SSL_DATE} -extensions v3_req \
    -extfile ${SSL_CONFIG}

echo -e "\033[32m ====> 7. 证书制作完成 \033[0m"
echo
echo -e "\033[32m ====> 8. 以YAML格式输出结果 \033[0m"
echo "----------------------------------------------------------"
echo "ca_key: |"
cat $CA_KEY | sed 's/^/  /'
echo
echo "ca_cert: |"
cat $CA_CERT | sed 's/^/  /'
echo
echo "ssl_key: |"
cat $SSL_KEY | sed 's/^/  /'
echo
echo "ssl_csr: |"
cat $SSL_CSR | sed 's/^/  /'
echo
echo "ssl_cert: |"
cat $SSL_CERT | sed 's/^/  /'
echo

echo -e "\033[32m ====> 9. 附加CA证书到Cert文件 \033[0m"
cat ${CA_CERT} >> ${SSL_CERT}
echo "ssl_cert: |"
cat $SSL_CERT | sed 's/^/  /'
echo

echo -e "\033[32m ====> 10. 重命名服务证书 \033[0m"
echo "cp ${SSL_DOMAIN}.key tls.key"
cp ${SSL_DOMAIN}.key tls.key
echo "cp ${SSL_DOMAIN}.crt tls.crt"
cp ${SSL_DOMAIN}.crt tls.crt
Make the script executable, then run it. Since the dashboard here is accessed by IP, it may also be worth adding --ssl-trusted-ip with the server's public IP:

chmod +x create_self-signed-cert.sh

./create_self-signed-cert.sh --ssl-domain=kubeflow.cn

Create a TLS secret from the generated key and certificate (the paths below assume the script was run in /root/ssl):

kubectl create --namespace istio-system secret tls kf-tls-cert --key /root/ssl/kubeflow.cn.key --cert /root/ssl/kubeflow.cn.crt

kubectl edit cm config-domain --namespace knative-serving

#under data, add the line:  kubeflow.cn: ""
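A sketch of the resulting data section of the config-domain ConfigMap (any existing keys stay as they are):

data:
  kubeflow.cn: ""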

Create kubeflow-https.yaml:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: kubeflow-gateway
  namespace: kubeflow
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: kf-tls-cert

Run kubectl apply -f kubeflow-https.yaml

Run kubectl -n istio-system get service istio-ingressgateway to look up the HTTPS NodePort; in this deployment it is 32143.

The dashboard is then reachable over HTTPS at https://ip:32143.

Multi-user isolation

After installation, only the [email protected] account can log in. The steps below add more accounts; each user gets an independent namespace and profile.

First create a Profile for each user. The YAML below creates two profiles, with namespaces test1 and test2:

apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: test1
spec:
  owner:
    kind: User
    name: [email protected]
---
apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: test2
spec:
  owner:
    kind: User
    name: [email protected]

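Apply the profiles (the filename profiles.yaml is arbitrary here):

kubectl apply -f profiles.yaml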
Run kubectl get configmap dex -n auth -o jsonpath='{.data.config\.yaml}' > dex-yaml.yaml to dump dex's user configuration to dex-yaml.yaml (the [email protected] entry in it was created during installation). Edit the file and add entries for test1 and test2, as sketched below.

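A minimal sketch of the additions under staticPasswords, where each hash is a bcrypt hash of the user's password; one way to generate it, assuming httpd-tools is installed, is htpasswd -bnBC 10 "" 12341234 | tr -d ':\n' :

staticPasswords:
- email: [email protected]
  hash: <existing hash, leave unchanged>
  username: user
- email: [email protected]
  hash: <bcrypt-hash>
  username: test1
- email: [email protected]
  hash: <bcrypt-hash>
  username: test2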

Run kubectl create configmap dex --from-file=config.yaml=dex-yaml.yaml -n auth --dry-run=client -o yaml | kubectl apply -f - to apply the new users.

Run kubectl rollout restart deployment dex -n auth to restart dex. Multi-user setup is now complete.
