Cloud native = microservices + DevOps + continuous delivery + containerization.
Why choose cloud native?
Core advantages of cloud native:
Decoupled software development, giving greater flexibility and maintainability
Multi-cloud support, avoiding vendor lock-in
No invasive, vendor-specific customization
Higher productivity and better resource utilization
| | Traditional | Cloud native |
|---|---|---|
| Focus | Longevity and stability | Time to market |
| Development approach | Waterfall, semi-agile development | Agile development, DevOps |
| Teams | Siloed development, operations, QA, and security teams | Collaborative DevOps teams |
| Delivery cycle | Long | Short and continuous |
| Application architecture | Tightly coupled, monolithic | Loosely coupled, service-based, API-driven communication |
| Infrastructure | Server-centric, on-premises only, infrastructure-dependent, scaled vertically, pre-provisioned for peak capacity | Container-centric, runs on-premises and in the cloud, portable across infrastructure, scaled horizontally, capacity provisioned on demand |
As the name suggests, DevOps is the combination of Development and Operations. Its goal is to break down the wall between development and operations and to improve communication and collaboration among development, operations, and quality assurance (QA), so that products can be developed and deployed in small, fast iterations and customer needs can be met quickly. It emphasizes integrating development and operations, strengthening communication and fast feedback between teams, in order to deliver products faster and with higher quality.
DevOps is not a new toolset but a mindset and a culture: a set of best practices for changing the traditional development-and-operations model. It is usually implemented with CI/CD (continuous integration / continuous deployment) tools and automated processes, using pipelines to change how developers and testers release software. As Docker and Kubernetes (k8s for short below) have become widespread and container-platform infrastructure has matured, the development and operations roles have converged, making cloud-native DevOps the way forward. Below we walk through how DevOps can be implemented on a hybrid container cloud platform.
Characteristics of cloud-native DevOps:
DevOps is a key functional module of a PaaS platform and includes the following capabilities: cloning code, compiling code, running scripts, building and publishing images, deploying YAML files, and deploying Helm applications; rich pipeline settings such as resource quotas, the number of concurrent pipeline runs, and triggering pipelines on code or image pushes, giving users efficient end-to-end pipelines across different environments; an out-of-the-box image registry; and pipeline caching, which can be configured for the whole pipeline or per step, so that steps such as cloning code, compiling code, and building images can reuse the cache to greatly shorten run time and improve efficiency. The detailed feature list is as follows:
Cloud-native DevOps in practice:
Applications built and run with cloud-native development are designed to take full advantage of the cloud computing model, which rests on four principles:
Service-based architecture: service-based architectures such as microservices advocate building loosely coupled, modular services. A loosely coupled, service-based design helps enterprises create applications faster and reduce complexity.
API-based communication: services call each other through lightweight APIs. With an API-driven approach, enterprises can create new business capabilities internally and externally through the APIs they expose, greatly increasing business agility. An API-based design also avoids the risks that come with direct linking, shared-memory models, or reading data stores directly when calling services.
Container-based infrastructure: cloud-native applications rely on containers for a common runtime model across technology environments and for true application portability across environments and infrastructure, including public, private, and hybrid clouds. Container platforms also enable elastic scaling of cloud-native applications.
DevOps processes: with a cloud-native approach, enterprises develop applications using agile methods and the principles of continuous delivery and DevOps. These methods and principles require development, QA, security, IT operations, and the other teams involved in delivery to build and deliver applications collaboratively.
Microservice release process:
The management platform supports two release-management approaches:
1) Manually uploading a resource package
2) A fully automated CI/CD flow
Microservice platform architecture design:
Basic service platform design:
Kubernetes' place in cloud native:
Kubernetes architecture diagram:
1. master
2. node
Every Node runs the following set of key processes: kubelet (which manages the Pods and their containers on the node), kube-proxy (which implements Service networking and load balancing), and the container runtime (for example Docker).
3. pod
A Pod is the smallest deployable unit that can be created in Kubernetes; the configuration template of a v1 core Pod is described by the Pod Template.
Create a Tomcat instance:
apiVersion: v1
kind: Pod
metadata:
  name: myweb
  labels:
    name: myweb
spec:
  containers:
  - name: myweb
    image: kubeguide/tomcat-app:v1
    ports:
    - containerPort: 8080
    env:
    - name: MYSQL_SERVICE_HOST
      value: 'mysql'
    - name: MYSQL_SERVICE_PORT
      value: '3306'
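Assuming the manifest above is saved to a file, the Pod can be created and inspected with kubectl (the file name below is illustrative):
kubectl apply -f myweb-pod.yaml   # create the Pod defined above
kubectl get pod myweb -o wide     # check that it is scheduled and running
kubectl logs myweb                # view the Tomcat container's output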
4. Label
A label is a key=value pair whose key and value are specified by the user. Labels can be attached to all kinds of resource objects, such as Node, Pod, Service, and RC.
Label examples:
Release labels: release:stable, release:canary
Environment labels: environment:dev, environment:qa, environment:production
Tier labels: tier:frontend, tier:backend, tier:cache
Partition labels: partition:customerA, partition:customerB
Quality-track labels: track:daily, track:weekly
Label selector:
Labels are not unique; many objects may carry the same label.
With a label selector, a client or user can specify a set of objects and operate on that set.
Examples:
$ kubectl get pods -l 'environment=production,tier=frontend'
$ kubectl get pods -l 'environment in (production),tier in (frontend)'
$ kubectl get pods -l 'environment in (production, qa)'
$ kubectl get pods -l 'environment,environment notin (frontend)'
Label selector scope, example 1:
Label selector scope, example 2:
5. Replication Controller
An RC defines a desired state: it declares that the number of replicas of a certain Pod must match an expected value at any time. An RC definition therefore includes the desired replica count, a label selector that picks the target Pods, and a Pod template used to create new Pods when the running count falls below the desired value.
A complete RC (ReplicaSet) example:
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
name: frontend
# these labels can be applied automatically
# from the labels in the pod template if not set
# labels:
# app: guestbook
# tier: frontend
spec:
# this replicas value is default
# modify it according to your case
replicas: 3
# selector can be applied automatically
# from the labels in the pod template if not set,
# but we are specifying the selector here to
# demonstrate its usage.
selector:
matchLabels:
tier: frontend
matchExpressions:
- {key: tier, operator: In, values: [frontend]}
template:
metadata:
labels:
app: guestbook
tier: frontend
spec:
containers:
- name: php-redis
image: gcr.io/google_samples/gb-frontend:v3
resources:
requests:
cpu: 100m
memory: 100Mi
env:
- name: GET_HOSTS_FROM
value: dns
# If your cluster config does not include a dns service, then to
# instead access environment variables to find service host
# info, comment out the 'value: dns' line above, and uncomment the
# line below.
# value: env
ports:
- containerPort: 80
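As a usage sketch, the ReplicaSet above can be applied and then scaled; the declared replica count is what the controller converges to (the file name is illustrative):
kubectl apply -f frontend-rs.yaml
kubectl get rs frontend                 # shows DESIRED/CURRENT/READY counts
kubectl scale rs frontend --replicas=5  # the controller creates two more Pods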
6. Deployment
A Deployment provides a declarative way to define Pods and ReplicaSets, replacing the older ReplicationController as the convenient way to manage applications.
Typical use cases include rolling out a ReplicaSet, declaratively updating Pods (rolling updates), rolling back to a previous revision, and scaling the application up or down, as shown in the commands after the example below.
A simple nginx application can be defined as:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 3
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
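The typical Deployment operations mentioned above map to kubectl commands roughly like the following sketch (the file name is illustrative):
kubectl apply -f nginx-deployment.yaml                           # create the Deployment
kubectl set image deployment/nginx-deployment nginx=nginx:1.9.1  # triggers a rolling update
kubectl rollout status deployment/nginx-deployment               # watch the rollout progress
kubectl rollout undo deployment/nginx-deployment                 # roll back to the previous revision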
7. Horizontal Pod Autoscaler
An application's resource usage usually has peaks and troughs. How can we smooth them out, improve overall cluster utilization, and have the number of Pods behind a Service adjust automatically?
That is what Horizontal Pod Autoscaling is for: as the name says, it scales Pods horizontally and automatically. This object (an API resource just like Pod and Deployment) is also where Kubernetes most clearly shows its value over traditional operations: no more manual scaling, and you can even scale on custom metrics.
Metrics support:
Depending on the API version, the HPA can base its autoscaling decisions on the following metrics: CPU utilization in autoscaling/v1, plus additional resource, custom, and external metrics in the autoscaling/v2beta* APIs.
A simple HPA example:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: php-apache
namespace: default
spec:
maxReplicas: 5
minReplicas: 2
scaleTargetRef:
kind: Deployment
name: php-apache
targetCPUUtilizationPercentage: 90
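An equivalent autoscaler can usually also be created imperatively, which is a quick way to try out the behavior described above:
# Creates an HPA targeting 90% CPU for the php-apache Deployment, between 2 and 5 replicas
kubectl autoscale deployment php-apache --cpu-percent=90 --min=2 --max=5
kubectl get hpa php-apache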
8. Service
A Kubernetes Service defines an abstraction: a logical group of Pods and a policy for accessing them, often called a microservice. The set of Pods targeted by a Service is usually determined by a Label selector (see below for why you might want a Service without a selector).
The relationship between Pod, RC, and Service:
Endpoint = Pod IP + container port.
A Service example:
kind: Service
apiVersion: v1
metadata:
name: my-service
spec:
selector:
app: MyApp
ports:
- protocol: TCP
port: 80
targetPort: 9376
1) Kubernetes service discovery
Kubernetes supports two basic service-discovery modes: environment variables and DNS.
When a Pod runs on a Node, the kubelet adds a set of environment variables for every active Service. It supports both Docker-links-compatible variables (see makeLinkVariables) and the simpler {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT variables, where the Service name is upper-cased and dashes are converted to underscores.
For example, a Service named "redis-master" that exposes TCP port 6379 and has been assigned cluster IP 10.0.0.11 produces the following environment variables:
REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11
DNS
Kubernetes introduces a DNS system as an add-on that maps Service names to DNS domain names, so programs can connect to a Service simply by using its name. Most applications on Kubernetes already use this DNS-based discovery mechanism; later chapters describe how to deploy and use the DNS system.
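As a small sketch of DNS-based discovery, assuming the cluster DNS add-on is installed and using the redis-master Service from the earlier example, any Pod can reach the Service by name (redis-cli here is only illustrative and must be present in the client container):
# Run from inside any Pod in the cluster
nslookup redis-master.default.svc.cluster.local   # resolves to the Service's cluster IP
redis-cli -h redis-master -p 6379                 # the short name works within the same namespace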
2) Accessing a Service from outside the cluster
To really understand Kubernetes, we need to be clear about its "three kinds of IP", which are: the Node IP (the real address of a cluster node), the Cluster IP (the virtual IP assigned to a Service), and the Pod IP (the address of each Pod).
3) Service types
For some parts of an application (such as a frontend), you may want to expose the Service on an IP address outside the Kubernetes cluster.
Kubernetes ServiceTypes let you specify the kind of Service you need; the default is ClusterIP.
The values of Type and their behavior are as follows: ClusterIP (cluster-internal virtual IP only), NodePort (expose the Service on a static port of every node), LoadBalancer (provision an external load balancer through the cloud provider), and ExternalName (map the Service to an external DNS name).
A simple NodePort example:
apiVersion: v1
kind: Service
metadata:
name: tomcat-service
spec:
type: NodePort
ports:
- port: 8080
nodePort: 31002
selector:
tier: frontend
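Once applied, a NodePort Service is reachable on the chosen port of every node's IP, for example:
# <node-ip> is the address of any Kubernetes node
curl http://<node-ip>:31002/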
9. Volume (storage volumes)
By default, data inside a container is not persistent: when the container dies, the data is lost. Docker therefore provides a Volume mechanism to persist data. Similarly, Kubernetes provides a more powerful Volume mechanism with a rich set of plugins, solving both data persistence and data sharing between containers.
1) Volume
Kubernetes currently supports the following Volume types:
- emptyDir
- hostPath
- gcePersistentDisk
- awsElasticBlockStore
- nfs
- iscsi
- flocker
- glusterfs
- rbd
- cephfs
- gitRepo
- secret
- persistentVolumeClaim
- downwardAPI
- azureFileVolume
- vsphereVolume
- flexvolume
Note that not all of these volumes are persistent; for example emptyDir, secret, and gitRepo volumes disappear when their Pod goes away.
2) PersistentVolume
For persistent volumes, PersistentVolume (PV) and PersistentVolumeClaim (PVC) provide a more convenient way to manage storage: a PV provides a network storage resource, while a PVC requests storage resources. The workflow for persistent storage is therefore: configure the underlying file system or cloud data volume, create the persistent volume, and finally create a claim that associates the Pod with the volume. PV and PVC decouple the Pod from the data volume, so the Pod does not need to know the exact file system or the persistence engine backing it.
3) PV
A PersistentVolume (PV) is a piece of networked storage in the cluster. Like a Node, it is a cluster resource. A PV is similar to a Volume but has a lifecycle independent of any Pod. For example, a local PV can be defined as:
kind: PersistentVolume
apiVersion: v1
metadata:
name: grafana-pv-volume
labels:
type: local
spec:
storageClassName: grafana-pv-volume
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Recycle
hostPath:
path: "/data/volume/grafana"
A PV has three access modes: ReadWriteOnce (mounted read-write by a single node), ReadOnlyMany (mounted read-only by many nodes), and ReadWriteMany (mounted read-write by many nodes).
4) StorageClass
Above we created a PV by hand, which becomes inconvenient when there are many volumes to manage. Kubernetes also provides StorageClass to create PVs dynamically; this not only saves administrators time but also lets different classes of storage be packaged up for PVCs to choose from.
A Ceph RBD example:
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
name: fast
provisioner: kubernetes.io/rbd
parameters:
monitors: 10.16.153.105:6789
adminId: kube
adminSecretName: ceph-secret
adminSecretNamespace: kube-system
pool: kube
userId: kube
userSecretName: ceph-secret-user
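With the StorageClass in place, a PVC only needs to reference it by name and a matching PV is provisioned on demand. A minimal sketch (the claim name and size are illustrative):
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast      # matches the StorageClass defined above
  resources:
    requests:
      storage: 8Gi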
5) PVC
A PV is a storage resource, and a PersistentVolumeClaim (PVC) is a request for PV resources. A PVC is analogous to a Pod: Pods consume Node resources while PVCs consume PV resources; Pods request CPU and memory, while PVCs request volumes of a specific size and access mode.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: grafana-pvc-volume
namespace: "monitoring"
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
storageClassName: grafana-pv-volume
A PVC can be mounted directly into a Pod:
kind: Pod
apiVersion: v1
metadata:
name: mypod
spec:
containers:
- name: myfrontend
image: dockerfile/nginx
volumeMounts:
- mountPath: "/var/www/html"
name: mypd
volumes:
- name: mypd
persistentVolumeClaim:
claimName: grafana-pvc-volume
6) Notes on other volume types
① NFS
NFS stands for Network File System. With a little configuration, Kubernetes can mount an NFS share into a Pod. Data on the NFS share is stored permanently, and NFS supports concurrent writes.
volumes:
- name: nfs
nfs:
# FIXME: use the right hostname
server: 10.254.234.223
path: "/data"
② emptyDir
If a Pod is configured with an emptyDir volume, the emptyDir is created when the Pod is scheduled onto a Node and exists for as long as the Pod runs on that Node (a container crash does not lose the data in the emptyDir). But if the Pod is removed from the Node (deleted or migrated), the emptyDir is deleted too and the data is lost permanently.
apiVersion: v1
kind: Pod
metadata:
name: test-pd
spec:
containers:
- image: gcr.io/google_containers/test-webserver
name: test-container
volumeMounts:
- mountPath: /test-pd
name: test-volume
volumes:
- name: test-volume
emptyDir: {}
10. namespace
在一个Kubernetes集群中可以使用namespace创建多个“虚拟集群”,这些namespace之间可以完全隔离,也可以通过某种方式,让一个namespace中的service可以访问到其他的namespace中的服务,我们在CentOS中部署kubernetes1.6集群的时候就用到了好几个跨越namespace的服务,比如Traefik ingress和kube-systemnamespace下的service就可以为整个集群提供服务,这些都需要通过RBAC定义集群级别的角色来实现。
哪些情况下适合使用多个namesapce?
因为namespace可以提供独立的命名空间,因此可以实现部分的环境隔离。当你的项目和人员众多的时候可以考虑根据项目属性,例如生产、测试、开发划分不同的namespace。
Namespace使用,获取集群中有哪些namespace:
kubectl get ns
集群中默认会有default和kube-system这两个namespace。
在执行kubectl命令时可以使用-n指定操作的namespace。
用户的普通应用默认是在default下,与集群管理相关的为整个集群提供服务的应用一般部署在kube-system的namespace下,例如我们在安装kubernetes集群时部署的kubedns、heapseter、EFK等都是在这个namespace下面。
另外,并不是所有的资源对象都会对应namespace,node和persistentVolume就不属于任何namespace。
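A short sketch of day-to-day namespace usage (the test namespace and file name are illustrative):
kubectl create namespace test            # create a namespace for a test environment
kubectl get pods -n kube-system          # -n selects the namespace to operate on
kubectl apply -f app-deploy.yaml -n test # deploy an application into it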
The traditional Jenkins one-master-many-slaves setup has several pain points:
Because of these pain points, we want a more efficient and reliable way to run the CI/CD process. Docker container technology solves them well, and it works even better on a Kubernetes cluster. The diagram below is a simple sketch of a Jenkins cluster built on Kubernetes:
As the diagram shows, the Jenkins Master and Jenkins Slaves run as Pods on the Nodes of the Kubernetes cluster. The Master runs on one of the nodes and stores its configuration data on a Volume. Slaves run on the various nodes, and they are not running all the time: they are created on demand and deleted automatically.
The workflow is roughly: when the Jenkins Master receives a build request, it dynamically creates a Jenkins Slave running in a Pod according to the configured label and registers it with the Master; when the job finishes, the Slave is deregistered and its Pod is deleted automatically, returning everything to the initial state. A sketch of such a dynamically created agent Pod is shown below.
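A rough sketch of the kind of agent Pod the Jenkins Kubernetes plugin creates on demand; the image, labels, and names are illustrative rather than the exact objects the plugin generates:
apiVersion: v1
kind: Pod
metadata:
  generateName: jenkins-slave-
  labels:
    jenkins: slave
spec:
  restartPolicy: Never        # the agent is one-shot: it runs a job and then goes away
  containers:
  - name: jnlp
    image: jenkins/inbound-agent:latest   # connects back to the Jenkins Master over JNLP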
So what benefits does this approach bring?
How do we deploy a cloud-native Kubernetes application?
Notes:
The Kubernetes application build and release flow can be summarized as follows:
Step 1: pull the code and declare global environment variables
#!/bin/bash
# Filename: k8s-deploy_v0.2.sh
# Description: Jenkins CI/CD continuous-delivery script
# Author: yi.hu
# Email: [email protected]
# Revision: 1.0
# Date: 2018-08-10
# Note: prd
# ZooKeeper base service; set the addresses for your environment
init() {
local lowerEnv="$(echo ${AppEnv} | tr '[:upper:]' '[:lower:]')"
case "${lowerEnv}" in
dev)
CFG_ADDR="10.34.11.186:4181"
DR_CFG_ZOOKEEPER_ENV_URL="10.34.11.186:4181"
;;
demo)
CFG_ADDR="10.34.11.186:4181"
DR_CFG_ZOOKEEPER_ENV_URL="10.34.11.186:4181"
;;
*)
echo "Not support AppEnv: ${AppEnv}"
exit 1
;;
esac
}
# Run the init function
init
# Initialize variables
AppId=$(echo ${AppOrg}_${AppEnv}_${AppName} |sed 's/[^a-zA-Z0-9_-]//g' | tr "[:lower:]" "[:upper:]")
CFG_LABEL=${CfgLabelBaseNode}/${AppId}
CFG_ADDR=${CFG_ADDR}
VERSION=$(echo "${GitBranch}" | sed 's@release/@@')
Step 2: log in to the Harbor registry
docker_login () {
docker login ${DOCKER_REGISTRY} -u${User} -p${PassWord}
}
Step 3: compile the code, build the application image, and push the image to the Harbor registry.
build() {
if [ "x${ACTION}" == "xDEPLOY" ] || [ "x${ACTION}" == "xPRE_DEPLOY" ]; then
echo "Test harbor registry: ${DOCKER_REGISTRY}"
curl --connect-timeout 30 -I ${DOCKER_REGISTRY}/api/projects 2>/dev/null | grep 'HTTP/1.1 200 OK' > /dev/null
echo "Check image EXIST or NOT: ${ToImage}"
ImageCheck_Harbor=$(echo ${ToImage} | sed 's/\([^/]\+\)\([^:]\+\):/\1\/api\/repositories\2\/tags\//')
Responed_Code=$(curl -u${User}:${PassWord} -so /dev/null -w '%{response_code}' ${ImageCheck_Harbor} || true)
echo ${Responed_Code}
if [ "${NoCache}" == "true" ] || [ "x${ResponedCode}" != "x200" ] ; then
if [ "x${ActionAfterBuild}" != "x" ]; then
eval ${ActionAfterBuild}
fi
echo "生成Dockerfile文件"
echo "FROM ${FromImage}" > Dockerfile
cat >> Dockerfile <<-EOF
${Dockerfile}
EOF
echo "同步上层镜像: ${FromImage}"
docker pull ${FromImage} # 同步上层镜像
echo "构建镜像,并Push到仓库: ${ToImage}"
docker build --no-cache=${NoCache} -t ${ToImage} . && docker push ${ToImage} || exit 1 # 开始构建镜像,成功后Push到仓库
echo "删除镜像: ${ToImage}"
docker rmi ${ToImage} || echo # 删除镜像
fi
fi
}
Step 4: deploy, pre-deploy, stop, or restart
deploy() {
if [ "x${ACTION}" == "xSTOP" ]; then
# Stop the current instances
kubectl delete -f ${AppName}-deploy.yaml
elif [ "x${ACTION}" == "xRESTART" ]; then
kubectl delete pod -n ${NameSpace} -l app=${AppName}
elif [ "x${ACTION}" == "xDEPLOY" ]; then
kubectl apply -f ${AppName}-deploy.yaml
fi
}
Step 5: check whether the Pods started correctly; on failure return 1 so the detailed error output is shown.
check_status() {
RETRY_COUNT=5
echo "检查 pod 运行状态"
while (( $RETRY_COUNT )); do
POD_STATUS=$(kubectl get pod -n ${NameSpace} -l app=${AppName} )
AVAILABLE_COUNT=$(kubectl get deploy -n ${NameSpace} -l app=${AppName} | awk '{print $(NF-1)}' | grep -v 'AVAILABLE')
if [ "X${AVAILABLE_COUNT}" != "X${Replicas}" ]; then
echo "[$(date '+%F %T')] Show pod Status , wait 30s and retry #$RETRY_COUNT "
echo "${POD_STATUS}"
let RETRY_COUNT-- || true
sleep 30
elif [ "X${AVAILABLE_COUNT}" == "X${Replicas}" ]; then
echo "Deploy Running successed"
break
else
echo "[$(date '+%F %T')] NOT expected pod status: "
echo "${POD_STATUS}"
return 1
fi
done
if [ "X${RETRY_COUNT}" == "X0" ]; then
echo "[$(date '+%F %T')] show describe pod status: "
echo -e "`kubectl describe pod -n ${NameSpace} -l app=${AppName}`"
fi
}
# Main flow: run the functions
docker_login
build
Step 6: substitute the parameters into the YAML file
cat > ${WORKSPACE}/${AppName}-deploy.yaml <<- EOF
#####################################################
#
# ${ACTION} Deployment
#
#####################################################
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: Deployment
metadata:
name: ${AppName}
namespace: ${NameSpace}
labels:
app: ${AppName}
version: ${VERSION}
AppEnv: ${AppEnv}
spec:
replicas: ${Replicas}
selector:
matchLabels:
app: ${AppName}
template:
metadata:
labels:
app: ${AppName}
spec:
containers:
- name: ${AppName}
image: ${ToImage}
ports:
- containerPort: ${ContainerPort}
livenessProbe:
httpGet:
path: ${HealthCheckURL}
port: ${ContainerPort}
initialDelaySeconds: 90
timeoutSeconds: 5
periodSeconds: 5
readinessProbe:
httpGet:
path: ${HealthCheckURL}
port: ${ContainerPort}
initialDelaySeconds: 5
timeoutSeconds: 5
periodSeconds: 5
# configmap env
env:
- name: CFG_LABEL
value: ${CFG_LABEL}
- name: HTTP_SERVER
value: ${HTTP_SERVER}
- name: CFG_ADDR
value: ${CFG_ADDR}
- name: DR_CFG_ZOOKEEPER_ENV_URL
value: ${DR_CFG_ZOOKEEPER_ENV_URL}
- name: ENTRYPOINT
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: ENTRYPOINT
- name: HTTP_TAR_FILES
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: HTTP_TAR_FILES
- name: WITH_SGHUB_APM_AGENT
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: WITH_SGHUB_APM_AGENT
- name: WITH_TINGYUN
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: WITH_TINGYUN
- name: CFG_FILES
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: CFG_FILES
# configMap volume
volumeMounts:
- name: applogs
mountPath: /volume_logs/
volumes:
- name: applogs
hostPath:
path: /opt/app_logs/${AppName}
imagePullSecrets:
- name: ${ImagePullSecrets}
---
apiVersion: v1
kind: Service
metadata:
name: ${AppName}
namespace: ${NameSpace}
labels:
app: ${AppName}
spec:
ports:
- port: ${ContainerPort}
targetPort: ${ContainerPort}
selector:
app: ${AppName}
EOF
Step 7: create the ConfigMap of environment variables
kubectl delete configmap ${ConfigMap} -n ${NameSpace}
kubectl create configmap ${ConfigMap} ${ConfigMapData} -n ${NameSpace}
# Run the deployment
deploy
# Print the rendered manifest
cat ${WORKSPACE}/${AppName}-deploy.yaml
# Run the startup status check
check_status
Set global resource-limit rules for a namespace:
cat ftc_demo_limitRange.yaml
apiVersion: v1
kind: LimitRange
metadata:
name: ftc-demo-limit
namespace: ftc-demo
spec:
limits:
- max: # maximum resources usable by a single Pod
cpu: "2"
memory: "6Gi"
min: # minimum resources for a single Pod
cpu: "100m"
memory: "2Gi"
type: Pod
- default: # default limits applied when a container starts
cpu: "1"
memory: "4Gi"
defaultRequest:
cpu: "200m"
memory: "2Gi"
max: # upper bound for user-specified container limits
cpu: "1"
memory: "4Gi"
min:
cpu: "100m"
memory: "256Mi"
type: Container
Explanation of the limit rules:
The effect when a Pod is created with the defaults:
kubectl describe pod -l app=ftc-saas-service -n ftc-demo
Controlled By: ReplicaSet/ftc-saas-service-7f998899c5
Containers:
ftc-saas-service:
Container ID: docker://9f3f1a56e36e1955e6606971e43ee138adab1e2429acc4279d353dcae40e3052
Image: dl-harbor.dianrong.com/ftc/ftc-saas-service:6058a7fbc6126332a3dbb682337c0ac68abc555e
Image ID: docker-pullable://dl-harbor.dianrong.com/ftc/ftc-saas-service@sha256:a96fa52b691880e8b21fc32fff98d82156b11780c53218d22a973a4ae3d61dd7
Port: 8080/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 21 Aug 2018 19:11:29 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 4Gi
Requests:
cpu: 200m
memory: 2Gi
A production example:
apiVersion: v1
kind: LimitRange
metadata:
name: jccfc-prod-limit
namespace: jccfc-prod
spec:
limits:
- max: # upper bound on the sum of all container limits in the Pod, i.e. the Pod's maximum
cpu: "4000m"
memory: "8192Mi"
min: # the sum of all container requests in the Pod must not be lower than these values
cpu: "1000m"
memory: "2048Mi"
type: Pod
- default: # default resource limits
cpu: "4000m"
memory: "4096Mi"
defaultRequest: # default resource requests
cpu: "2000m"
memory: "4096Mi"
maxLimitRequestRatio:
cpu: 2
memory: 2
type: Container
Why Prometheus?
Prometheus architecture:
The prometheus-operator cluster:
1) Operator: operational knowledge baked into software
SREs run applications the way software is developed. They are engineers and developers who know how to write software, especially for a particular application domain, and what they produce is software that encodes the operational knowledge of that application.
Our team has been designing and implementing a concept in the Kubernetes community: a way to reliably create, configure, and manage complex applications on top of Kubernetes.
We call this kind of software an Operator. An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications the way a Kubernetes user would. It builds on the basic Kubernetes resource and controller concepts, but adds domain-specific operational knowledge to automate routine tasks.
2) Kubernetes Operators
With Kubernetes, managing and scaling web applications, mobile backends, and API services is fairly simple, because these applications are usually stateless, so basic Kubernetes API objects such as Deployment can scale and recover them without extra work.
Managing stateful applications such as databases, caches, and monitoring systems is the real challenge. These systems need domain knowledge to scale and upgrade correctly and to reconfigure themselves effectively when data is lost or unavailable. We want that application-specific operational knowledge encoded into software, so that complex applications can be run and managed correctly on top of Kubernetes' capabilities.
An Operator uses the TPR (third-party resource, since superseded by CRD) mechanism to extend the Kubernetes API with application-specific knowledge, letting users create, configure, and manage the application. Like Kubernetes' built-in resources, an Operator manages not a single application instance but multiple instances across the cluster.
3) How an Operator is built
Operators build on two core Kubernetes concepts: resources and controllers. For example, the built-in ReplicaSet resource lets a user declare how many Pods should run, and the built-in controller creates or removes Pods to keep the ReplicaSet in its desired state. Many basic Kubernetes controllers and resources work this way, including Service, Deployment, and DaemonSet.
A user scales an RS from one Pod to three.
A while later, the Kubernetes controller creates new Pods to match the user's intent.
An Operator layers application-specific knowledge and configuration on top of these basic resources and controllers so it can carry out the routine tasks of a particular piece of software. For example, to scale an etcd cluster manually, an operator must perform several steps: create a DNS name for the new etcd instance, start the new etcd instance, and use the etcd admin tool (etcdctl member add) to tell the existing cluster about the new member. A user of the etcd Operator instead just increments the cluster-size field by one.
A user triggers a backup with kubectl.
There are many more complex management tasks, including application upgrades, configuration backups, service discovery through the native Kubernetes API, TLS certificate configuration, and disaster recovery.
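To make this concrete, the following is a minimal sketch of the kind of CustomResourceDefinition an Operator registers; the group and kind follow the etcd Operator's naming, but treat the exact fields as illustrative:
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: etcdclusters.etcd.database.coreos.com
spec:
  group: etcd.database.coreos.com    # the API group the Operator serves
  version: v1beta2
  scope: Namespaced
  names:
    plural: etcdclusters
    kind: EtcdCluster                # users then create EtcdCluster objects with kubectl
    shortNames:
    - etcd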
Create the prometheus-operator ServiceAccount:
cat prometheus-operator-service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus-operator
namespace: monitoring
Create the prometheus-operator ClusterRole:
cat prometheus-operator-cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: prometheus-operator
rules:
- apiGroups:
- extensions
resources:
- thirdpartyresources
verbs:
- "*"
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- "*"
- apiGroups:
- monitoring.coreos.com
resources:
- alertmanagers
- prometheuses
- servicemonitors
verbs:
- "*"
- apiGroups:
- apps
resources:
- statefulsets
verbs: ["*"]
- apiGroups: [""]
resources:
- configmaps
- secrets
verbs: ["*"]
- apiGroups: [""]
resources:
- pods
verbs: ["list", "delete"]
- apiGroups: [""]
resources:
- services
- endpoints
verbs: ["get", "create", "update"]
- apiGroups: [""]
resources:
- nodes
verbs: ["list", "watch"]
- apiGroups: [""]
resources:
- namespaces
verbs: ["list"]
Bind the ClusterRole:
cat prometheus-operator-cluster-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: prometheus-operator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus-operator
subjects:
- kind: ServiceAccount
name: prometheus-operator
namespace: monitoring
Deploy prometheus-operator:
# cat prometheus-operator.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
labels:
k8s-app: prometheus-operator
name: prometheus-operator
namespace: monitoring
spec:
replicas: 1
template:
metadata:
labels:
k8s-app: prometheus-operator
spec:
containers:
- args:
- --kubelet-service=kube-system/kubelet
- --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
image: quay.io/coreos/prometheus-operator:v0.15.0
name: prometheus-operator
ports:
- containerPort: 8080
name: http
resources:
limits:
cpu: 200m
memory: 100Mi
requests:
cpu: 100m
memory: 50Mi
serviceAccountName: prometheus-operator
Deploy the prometheus-operator Service:
cat prometheus-operator-service.yaml
apiVersion: v1
kind: Service
metadata:
name: prometheus-operator
namespace: monitoring
labels:
k8s-app: prometheus-operator
spec:
type: ClusterIP
ports:
- name: http
port: 8080
targetPort: http
protocol: TCP
selector:
k8s-app: prometheus-operator
Deploy the prometheus-operator ServiceMonitor:
cat prometheus-k8s-service-monitor-prometheus-operator.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: prometheus-operator
namespace: monitoring
labels:
k8s-app: prometheus-operator
spec:
endpoints:
- port: http
selector:
matchLabels:
k8s-app: prometheus-operator
Create the monitoring namespace:
apiVersion: v1
kind: Namespace
metadata:
name: monitoring
1) Prometheus RBAC permission management
Create the prometheus-k8s ServiceAccount:
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus-k8s
namespace: monitoring
Create Roles granting the prometheus-k8s user its permissions in the monitoring, kube-system, and default namespaces:
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: prometheus-k8s
namespace: monitoring
rules:
- apiGroups: [""]
resources:
- nodes
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources:
- configmaps
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: prometheus-k8s
namespace: kube-system
rules:
- apiGroups: [""]
resources:
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: prometheus-k8s
namespace: default
rules:
- apiGroups: [""]
resources:
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: prometheus-k8s
rules:
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
Bind the Roles and ClusterRole to the service account:
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: prometheus-k8s
namespace: default
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-k8s
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: prometheus-k8s
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-k8s
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: prometheus-k8s
namespace: monitoring
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prometheus-k8s
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: prometheus-k8s
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus-k8s
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: monitoring
2) Deploy Prometheus as a StatefulSet
kind: Secret
apiVersion: v1
data:
key: QVFCZU54dFlkMVNvRUJBQUlMTUVXMldSS29mdWhlamNKaC8yRXc9PQ==
metadata:
name: ceph-secret
namespace: monitoring
type: kubernetes.io/rbd
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: prometheus-ceph-fast
namespace: monitoring
provisioner: ceph.com/rbd
parameters:
monitors: 10.18.19.91:6789
adminId: admin
adminSecretName: ceph-secret
adminSecretNamespace: monitoring
userSecretName: ceph-secret
pool: prometheus-dev
userId: admin
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
name: k8s
namespace: monitoring
labels:
prometheus: k8s
spec:
replicas: 2
version: v2.0.0
serviceAccountName: prometheus-k8s
serviceMonitorSelector:
matchExpressions:
- {key: k8s-app, operator: Exists}
ruleSelector:
matchLabels:
role: prometheus-rulefiles
prometheus: k8s
resources:
requests:
memory: 4G
storage:
volumeClaimTemplate:
metadata:
name: prometheus-data
annotations:
volume.beta.kubernetes.io/storage-class: prometheus-ceph-fast
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 50Gi
alerting:
alertmanagers:
- namespace: monitoring
name: alertmanager-main
port: web
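After applying the Prometheus resource above, the operator generates a StatefulSet for it; a quick way to verify (object names follow the prometheus: k8s example, and `kubectl get prometheus` assumes the operator has registered the CRD):
kubectl get prometheus -n monitoring          # the custom resource itself
kubectl get statefulset,pod -n monitoring     # the StatefulSet and Pods created by the operator
kubectl get pvc -n monitoring                 # the claims created from the volumeClaimTemplate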
3) Use Ceph RBD as Prometheus' persistent storage (there is a pitfall here)
In this environment the api-server is deployed as a container. Because the default api-server image has no Ceph driver, Pods fail at the volume-mount step.
Use an external provisioner to provide the Ceph RBD driver:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: rbd-provisioner
namespace: monitoring
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: rbd-provisioner
spec:
containers:
- name: rbd-provisioner
image: "quay.io/external_storage/rbd-provisioner:latest"
env:
- name: PROVISIONER_NAME
value: ceph.com/rbd
args: ["-master=http://10.18.19.143:8080", "-id=rbd-provisioner"]
4) Create the Prometheus Service
apiVersion: v1
kind: Service
metadata:
labels:
prometheus: k8s
name: prometheus-k8s
namespace: monitoring
spec:
type: NodePort
ports:
- name: web
nodePort: 30900
port: 9090
protocol: TCP
targetPort: web
selector:
prometheus: k8s
5) Create the Prometheus alerting rule files with a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-k8s-rules
namespace: monitoring
labels:
role: prometheus-rulefiles
prometheus: k8s
data:
alertmanager.rules.yaml: |+
groups:
- name: alertmanager.rules
rules:
- alert: AlertmanagerConfigInconsistent
expr: count_values("config_hash", alertmanager_config_hash) BY (service) / ON(service)
GROUP_LEFT() label_replace(prometheus_operator_alertmanager_spec_replicas, "service",
"alertmanager-$1", "alertmanager", "(.*)") != 1
for: 5m
labels:
severity: critical
annotations:
description: The configuration of the instances of the Alertmanager cluster
`{{$labels.service}}` are out of sync.
- alert: AlertmanagerDownOrMissing
expr: label_replace(prometheus_operator_alertmanager_spec_replicas, "job", "alertmanager-$1",
"alertmanager", "(.*)") / ON(job) GROUP_RIGHT() sum(up) BY (job) != 1
for: 5m
labels:
severity: warning
annotations:
description: An unexpected number of Alertmanagers are scraped or Alertmanagers
disappeared from discovery.
- alert: AlertmanagerFailedReload
expr: alertmanager_config_last_reload_successful == 0
for: 10m
labels:
severity: warning
annotations:
description: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
}}/{{ $labels.pod}}.
etcd3.rules.yaml: |+
groups:
- name: ./etcd3.rules
rules:
- alert: InsufficientMembers
expr: count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
for: 3m
labels:
severity: critical
annotations:
description: If one more etcd member goes down the cluster will be unavailable
summary: etcd cluster insufficient members
- alert: NoLeader
expr: etcd_server_has_leader{job="etcd"} == 0
for: 1m
labels:
severity: critical
annotations:
description: etcd member {{ $labels.instance }} has no leader
summary: etcd member has no leader
- alert: HighNumberOfLeaderChanges
expr: increase(etcd_server_leader_changes_seen_total{job="etcd"}[1h]) > 3
labels:
severity: warning
annotations:
description: etcd instance {{ $labels.instance }} has seen {{ $value }} leader
changes within the last hour
summary: a high number of leader changes within the etcd cluster are happening
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.01
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
/ sum(rate(etcd_grpc_total{job="etcd"}[5m])) BY (grpc_method) > 0.05
for: 5m
labels:
severity: critical
annotations:
description: '{{ $value }}% of requests for {{ $labels.grpc_method }} failed
on etcd instance {{ $labels.instance }}'
summary: a high number of gRPC requests are failing
- alert: GRPCRequestsSlow
expr: histogram_quantile(0.99, rate(etcd_grpc_unary_requests_duration_seconds_bucket[5m]))
> 0.15
for: 10m
labels:
severity: critical
annotations:
description: on etcd instance {{ $labels.instance }} gRPC requests to {{ $labels.grpc_method
}} are slow
summary: slow gRPC requests
- alert: HighNumberOfFailedHTTPRequests
expr: sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
BY (method) > 0.01
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of requests for {{ $labels.method }} failed on etcd
instance {{ $labels.instance }}'
summary: a high number of HTTP requests are failing
- alert: HighNumberOfFailedHTTPRequests
expr: sum(rate(etcd_http_failed_total{job="etcd"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job="etcd"}[5m]))
BY (method) > 0.05
for: 5m
labels:
severity: critical
annotations:
description: '{{ $value }}% of requests for {{ $labels.method }} failed on etcd
instance {{ $labels.instance }}'
summary: a high number of HTTP requests are failing
- alert: HTTPRequestsSlow
expr: histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))
> 0.15
for: 10m
labels:
severity: warning
annotations:
description: on etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method
}} are slow
summary: slow HTTP requests
- alert: EtcdMemberCommunicationSlow
expr: histogram_quantile(0.99, rate(etcd_network_member_round_trip_time_seconds_bucket[5m]))
> 0.15
for: 10m
labels:
severity: warning
annotations:
description: etcd instance {{ $labels.instance }} member communication with
{{ $labels.To }} is slow
summary: etcd member communication is slow
- alert: HighNumberOfFailedProposals
expr: increase(etcd_server_proposals_failed_total{job="etcd"}[1h]) > 5
labels:
severity: warning
annotations:
description: etcd instance {{ $labels.instance }} has seen {{ $value }} proposal
failures within the last hour
summary: a high number of proposals within the etcd cluster are failing
- alert: HighFsyncDurations
expr: histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))
> 0.5
for: 10m
labels:
severity: warning
annotations:
description: etcd instance {{ $labels.instance }} fync durations are high
summary: high fsync durations
- alert: HighCommitDurations
expr: histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[5m]))
> 0.25
for: 10m
labels:
severity: warning
annotations:
description: etcd instance {{ $labels.instance }} commit durations are high
summary: high commit durations
general.rules.yaml: |+
groups:
- name: general.rules
rules:
- alert: TargetDown
expr: 100 * (count(up == 0) BY (job) / count(up) BY (job)) > 10
for: 10m
labels:
severity: warning
annotations:
description: '{{ $value }}% of {{ $labels.job }} targets are down.'
summary: Targets are down
- alert: DeadMansSwitch
expr: vector(1)
labels:
severity: none
annotations:
description: This is a DeadMansSwitch meant to ensure that the entire Alerting
pipeline is functional.
summary: Alerting DeadMansSwitch
- record: fd_utilization
expr: process_open_fds / process_max_fds
- alert: FdExhaustionClose
expr: predict_linear(fd_utilization[1h], 3600 * 4) > 1
for: 10m
labels:
severity: warning
annotations:
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
will exhaust in file/socket descriptors within the next 4 hours'
summary: file descriptors soon exhausted
- alert: FdExhaustionClose
expr: predict_linear(fd_utilization[10m], 3600) > 1
for: 10m
labels:
severity: critical
annotations:
description: '{{ $labels.job }}: {{ $labels.namespace }}/{{ $labels.pod }} instance
will exhaust in file/socket descriptors within the next hour'
summary: file descriptors soon exhausted
kube-controller-manager.rules.yaml: |+
groups:
- name: kube-controller-manager.rules
rules:
- alert: K8SControllerManagerDown
expr: absent(up{job="kube-controller-manager"} == 1)
for: 5m
labels:
severity: critical
annotations:
description: There is no running K8S controller manager. Deployments and replication
controllers are not making progress.
runbook: https://coreos.com/tectonic/docs/latest/troubleshooting/controller-recovery.html#recovering-a-controller-manager
summary: Controller manager is down
kube-scheduler.rules.yaml: |+
groups:
- name: kube-scheduler.rules
rules:
- record: cluster:scheduler_e2e_scheduling_latency_seconds:quantile
expr: histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.99"
- record: cluster:scheduler_e2e_scheduling_latency_seconds:quantile
expr: histogram_quantile(0.9, sum(scheduler_e2e_scheduling_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.9"
- record: cluster:scheduler_e2e_scheduling_latency_seconds:quantile
expr: histogram_quantile(0.5, sum(scheduler_e2e_scheduling_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.5"
- record: cluster:scheduler_scheduling_algorithm_latency_seconds:quantile
expr: histogram_quantile(0.99, sum(scheduler_scheduling_algorithm_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.99"
- record: cluster:scheduler_scheduling_algorithm_latency_seconds:quantile
expr: histogram_quantile(0.9, sum(scheduler_scheduling_algorithm_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.9"
- record: cluster:scheduler_scheduling_algorithm_latency_seconds:quantile
expr: histogram_quantile(0.5, sum(scheduler_scheduling_algorithm_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.5"
- record: cluster:scheduler_binding_latency_seconds:quantile
expr: histogram_quantile(0.99, sum(scheduler_binding_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.99"
- record: cluster:scheduler_binding_latency_seconds:quantile
expr: histogram_quantile(0.9, sum(scheduler_binding_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.9"
- record: cluster:scheduler_binding_latency_seconds:quantile
expr: histogram_quantile(0.5, sum(scheduler_binding_latency_microseconds_bucket)
BY (le, cluster)) / 1e+06
labels:
quantile: "0.5"
- alert: K8SSchedulerDown
expr: absent(up{job="kube-scheduler"} == 1)
for: 5m
labels:
severity: critical
annotations:
description: There is no running K8S scheduler. New pods are not being assigned
to nodes.
runbook: https://coreos.com/tectonic/docs/latest/troubleshooting/controller-recovery.html#recovering-a-scheduler
summary: Scheduler is down
kube-state-metrics.rules.yaml: |+
groups:
- name: kube-state-metrics.rules
rules:
- alert: DeploymentGenerationMismatch
expr: kube_deployment_status_observed_generation != kube_deployment_metadata_generation
for: 15m
labels:
severity: warning
annotations:
description: Observed deployment generation does not match expected one for
deployment {{$labels.namespaces}}{{$labels.deployment}}
- alert: DeploymentReplicasNotUpdated
expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
unless (kube_deployment_spec_paused == 1)
for: 15m
labels:
severity: warning
annotations:
description: Replicas are not updated and available for deployment {{$labels.namespaces}}/{{$labels.deployment}}
- alert: DaemonSetRolloutStuck
expr: kube_daemonset_status_current_number_ready / kube_daemonset_status_desired_number_scheduled
* 100 < 100
for: 15m
labels:
severity: warning
annotations:
description: Only {{$value}}% of desired pods scheduled and ready for daemon
set {{$labels.namespaces}}/{{$labels.daemonset}}
- alert: K8SDaemonSetsNotScheduled
expr: kube_daemonset_status_desired_number_scheduled - kube_daemonset_status_current_number_scheduled
> 0
for: 10m
labels:
severity: warning
annotations:
description: A number of daemonsets are not scheduled.
summary: Daemonsets are not scheduled correctly
- alert: DaemonSetsMissScheduled
expr: kube_daemonset_status_number_misscheduled > 0
for: 10m
labels:
severity: warning
annotations:
description: A number of daemonsets are running where they are not supposed
to run.
summary: Daemonsets are not scheduled correctly
- alert: PodFrequentlyRestarting
expr: increase(kube_pod_container_status_restarts[1h]) > 5
for: 10m
labels:
severity: warning
annotations:
description: Pod {{$labels.namespaces}}/{{$labels.pod}} is was restarted {{$value}}
times within the last hour
kubelet.rules.yaml: |+
groups:
- name: kubelet.rules
rules:
- alert: K8SNodeNotReady
expr: kube_node_status_condition{condition="Ready",status="true"} == 0
for: 1h
labels:
severity: warning
annotations:
description: The Kubelet on {{ $labels.node }} has not checked in with the API,
or has set itself to NotReady, for more than an hour
summary: Node status is NotReady
- alert: K8SManyNodesNotReady
expr: count(kube_node_status_condition{condition="Ready",status="true"} == 0)
> 1 and (count(kube_node_status_condition{condition="Ready",status="true"} ==
0) / count(kube_node_status_condition{condition="Ready",status="true"})) > 0.2
for: 1m
labels:
severity: critical
annotations:
description: '{{ $value }}% of Kubernetes nodes are not ready'
- alert: K8SKubeletDown
expr: count(up{job="kubelet"} == 0) / count(up{job="kubelet"}) * 100 > 3
for: 1h
labels:
severity: warning
annotations:
description: Prometheus failed to scrape {{ $value }}% of kubelets.
- alert: K8SKubeletDown
expr: (absent(up{job="kubelet"} == 1) or count(up{job="kubelet"} == 0) / count(up{job="kubelet"}))
* 100 > 1
for: 1h
labels:
severity: critical
annotations:
description: Prometheus failed to scrape {{ $value }}% of kubelets, or all Kubelets
have disappeared from service discovery.
summary: Many Kubelets cannot be scraped
- alert: K8SKubeletTooManyPods
expr: kubelet_running_pod_count > 100
for: 10m
labels:
severity: warning
annotations:
description: Kubelet {{$labels.instance}} is running {{$value}} pods, close
to the limit of 110
summary: Kubelet is close to pod limit
kubernetes.rules.yaml: |+
groups:
- name: kubernetes.rules
rules:
- record: pod_name:container_memory_usage_bytes:sum
expr: sum(container_memory_usage_bytes{container_name!="POD",pod_name!=""}) BY
(pod_name)
- record: pod_name:container_spec_cpu_shares:sum
expr: sum(container_spec_cpu_shares{container_name!="POD",pod_name!=""}) BY (pod_name)
- record: pod_name:container_cpu_usage:sum
expr: sum(rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name!=""}[5m]))
BY (pod_name)
- record: pod_name:container_fs_usage_bytes:sum
expr: sum(container_fs_usage_bytes{container_name!="POD",pod_name!=""}) BY (pod_name)
- record: namespace:container_memory_usage_bytes:sum
expr: sum(container_memory_usage_bytes{container_name!=""}) BY (namespace)
- record: namespace:container_spec_cpu_shares:sum
expr: sum(container_spec_cpu_shares{container_name!=""}) BY (namespace)
- record: namespace:container_cpu_usage:sum
expr: sum(rate(container_cpu_usage_seconds_total{container_name!="POD"}[5m]))
BY (namespace)
- record: cluster:memory_usage:ratio
expr: sum(container_memory_usage_bytes{container_name!="POD",pod_name!=""}) BY
(cluster) / sum(machine_memory_bytes) BY (cluster)
- record: cluster:container_spec_cpu_shares:ratio
expr: sum(container_spec_cpu_shares{container_name!="POD",pod_name!=""}) / 1000
/ sum(machine_cpu_cores)
- record: cluster:container_cpu_usage:ratio
expr: rate(container_cpu_usage_seconds_total{container_name!="POD",pod_name!=""}[5m])
/ sum(machine_cpu_cores)
- record: apiserver_latency_seconds:quantile
expr: histogram_quantile(0.99, rate(apiserver_request_latencies_bucket[5m])) /
1e+06
labels:
quantile: "0.99"
- record: apiserver_latency:quantile_seconds
expr: histogram_quantile(0.9, rate(apiserver_request_latencies_bucket[5m])) /
1e+06
labels:
quantile: "0.9"
- record: apiserver_latency_seconds:quantile
expr: histogram_quantile(0.5, rate(apiserver_request_latencies_bucket[5m])) /
1e+06
labels:
quantile: "0.5"
- alert: APIServerLatencyHigh
expr: apiserver_latency_seconds:quantile{quantile="0.99",subresource!="log",verb!~"^(?:WATCH|WATCHLIST|PROXY|CONNECT)$"}
> 1
for: 10m
labels:
severity: warning
annotations:
description: the API server has a 99th percentile latency of {{ $value }} seconds
for {{$labels.verb}} {{$labels.resource}}
- alert: APIServerLatencyHigh
expr: apiserver_latency_seconds:quantile{quantile="0.99",subresource!="log",verb!~"^(?:WATCH|WATCHLIST|PROXY|CONNECT)$"}
> 4
for: 10m
labels:
severity: critical
annotations:
description: the API server has a 99th percentile latency of {{ $value }} seconds
for {{$labels.verb}} {{$labels.resource}}
- alert: APIServerErrorsHigh
expr: rate(apiserver_request_count{code=~"^(?:5..)$"}[5m]) / rate(apiserver_request_count[5m])
* 100 > 2
for: 10m
labels:
severity: warning
annotations:
description: API server returns errors for {{ $value }}% of requests
- alert: APIServerErrorsHigh
expr: rate(apiserver_request_count{code=~"^(?:5..)$"}[5m]) / rate(apiserver_request_count[5m])
* 100 > 5
for: 10m
labels:
severity: critical
annotations:
description: API server returns errors for {{ $value }}% of requests
- alert: K8SApiserverDown
expr: absent(up{job="apiserver"} == 1)
for: 20m
labels:
severity: critical
annotations:
description: No API servers are reachable or all have disappeared from service
discovery
node.rules.yaml: |+
groups:
- name: node.rules
rules:
- record: instance:node_cpu:rate:sum
expr: sum(rate(node_cpu{mode!="idle",mode!="iowait",mode!~"^(?:guest.*)$"}[3m]))
BY (instance)
- record: instance:node_filesystem_usage:sum
expr: sum((node_filesystem_size{mountpoint="/"} - node_filesystem_free{mountpoint="/"}))
BY (instance)
- record: instance:node_network_receive_bytes:rate:sum
expr: sum(rate(node_network_receive_bytes[3m])) BY (instance)
- record: instance:node_network_transmit_bytes:rate:sum
expr: sum(rate(node_network_transmit_bytes[3m])) BY (instance)
- record: instance:node_cpu:ratio
expr: sum(rate(node_cpu{mode!="idle"}[5m])) WITHOUT (cpu, mode) / ON(instance)
GROUP_LEFT() count(sum(node_cpu) BY (instance, cpu)) BY (instance)
- record: cluster:node_cpu:sum_rate5m
expr: sum(rate(node_cpu{mode!="idle"}[5m]))
- record: cluster:node_cpu:ratio
expr: cluster:node_cpu:rate5m / count(sum(node_cpu) BY (instance, cpu))
- alert: NodeExporterDown
expr: absent(up{job="node-exporter"} == 1)
for: 10m
labels:
severity: warning
annotations:
description: Prometheus could not scrape a node-exporter for more than 10m,
or node-exporters have disappeared from discovery
- alert: NodeDiskRunningFull
expr: predict_linear(node_filesystem_free[6h], 3600 * 24) < 0
for: 30m
labels:
severity: warning
annotations:
description: device {{$labels.device}} on node {{$labels.instance}} is running
full within the next 24 hours (mounted at {{$labels.mountpoint}})
- alert: NodeDiskRunningFull
expr: predict_linear(node_filesystem_free[30m], 3600 * 2) < 0
for: 10m
labels:
severity: critical
annotations:
description: device {{$labels.device}} on node {{$labels.instance}} is running
full within the next 2 hours (mounted at {{$labels.mountpoint}})
- alert: NodeCPUUsage
expr: (100 - (avg by (instance) (irate(node_cpu{job="node-exporter",mode="idle"}[5m])) * 100)) > 5
for: 10m
labels:
severity: critical
annotations:
# description: {{$labels.instance}} CPU usage is above 75% (current value is {{ $value }})
prometheus.rules.yaml: |+
groups:
- name: prometheus.rules
rules:
- alert: PrometheusConfigReloadFailed
expr: prometheus_config_last_reload_successful == 0
for: 10m
labels:
severity: warning
annotations:
description: Reloading Prometheus' configuration has failed for {{$labels.namespace}}/{{$labels.pod}}
- alert: PrometheusNotificationQueueRunningFull
expr: predict_linear(prometheus_notifications_queue_length[5m], 60 * 30) > prometheus_notifications_queue_capacity
for: 10m
labels:
severity: warning
annotations:
description: Prometheus' alert notification queue is running full for {{$labels.namespace}}/{{
$labels.pod}}
- alert: PrometheusErrorSendingAlerts
expr: rate(prometheus_notifications_errors_total[5m]) / rate(prometheus_notifications_sent_total[5m])
> 0.01
for: 10m
labels:
severity: warning
annotations:
description: Errors while sending alerts from Prometheus {{$labels.namespace}}/{{
$labels.pod}} to Alertmanager {{$labels.Alertmanager}}
- alert: PrometheusErrorSendingAlerts
expr: rate(prometheus_notifications_errors_total[5m]) / rate(prometheus_notifications_sent_total[5m])
> 0.03
for: 10m
labels:
severity: critical
annotations:
description: Errors while sending alerts from Prometheus {{$labels.namespace}}/{{
$labels.pod}} to Alertmanager {{$labels.Alertmanager}}
- alert: PrometheusNotConnectedToAlertmanagers
expr: prometheus_notifications_alertmanagers_discovered < 1
for: 10m
labels:
severity: warning
annotations:
description: Prometheus {{ $labels.namespace }}/{{ $labels.pod}} is not connected
to any Alertmanagers
noah_pod.rules.yaml: |+
groups:
- name: noah_pod.rules
rules:
- alert: Pod_all_cpu_usage
expr: (sum by(name)(rate(container_cpu_usage_seconds_total{image!=""}[5m]))*100) > 10
for: 5m
labels:
severity: critical
service: pods
annotations:
description: Container {{ $labels.name }} CPU usage is above 75% (current value is {{ $value }})
summary: Dev CPU load alert
- alert: Pod_all_memory_usage
expr: sort_desc(avg by(name)(irate(container_memory_usage_bytes{name!=""}[5m]))*100) > 1024*10^3*2
for: 10m
labels:
severity: critical
annotations:
description: Container {{ $labels.name }} memory usage is above 2G (current value is {{ $value }})
summary: Dev memory load alert
- alert: Pod_all_network_receive_usage
expr: sum by (name)(irate(container_network_receive_bytes_total{container_name="POD"}[1m])) > 1024*1024*50
for: 10m
labels:
severity: critical
annotations:
description: Container {{ $labels.name }} network receive rate is above 50M (current value is {{ $value }})
summary: network_receive load alert
kube-state-metrics GitHub project address
Check the kube-state-metrics configuration files:
# tree kube-state-metrics/
kube-state-metrics/
├── kube-state-metrics-cluster-role-binding.yaml
├── kube-state-metrics-cluster-role.yaml
├── kube-state-metrics-deployment.yaml
├── kube-state-metrics-role-binding.yaml
├── kube-state-metrics-role.yaml
├── kube-state-metrics-service-account.yaml
├── kube-state-metrics-service.yaml
└── prometheus-k8s-service-monitor-kube-state-metrics.yaml
0 directories, 8 files
Create kube-state-metrics:
kubectl apply -f kube-state-metrics/
Check that the services started:
[root@app553 ~]# kubectl get pod,svc -n monitoring
NAME READY STATUS RESTARTS AGE
po/kube-state-metrics-678d94f978-nv4g5 2/2 Running 0 52m
po/prometheus-k8s-0 2/2 Running 0 4d
po/prometheus-k8s-1 2/2 Running 0 4d
po/prometheus-operator-68589bfbfd-pf4tk 1/1 Running 0 4d
po/rbd-provisioner-698ddf4568-w7s29 1/1 Running 0 4d
Check the kube-state-metrics target status in Prometheus:
node-exporter monitors the basic resources of the Kubernetes node hosts: CPU, memory, network, disk, and so on.
Use a DaemonSet to automatically deploy the monitoring agent on every node:
# tree node-exporter/
node-exporter/
├── node-exporter-daemonset.yaml
├── node-exporter-service.yaml
└── prometheus-k8s-service-monitor-node-exporter.yaml
Create the node-exporter service:
kubectl apply -f node-exporter/
Check the Pod status:
# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
kube-state-metrics-678d94f978-nv4g5 2/2 Running 0 19h
node-exporter-xm9cl 1/1 Running 0 1m
prometheus-k8s-0 2/2 Running 0 18h
prometheus-k8s-1 2/2 Running 0 18h
prometheus-operator-68589bfbfd-pq4d4 1/1 Running 0 18h
rbd-provisioner-698ddf4568-d57l2 1/1 Running 0 18h
Check the Prometheus target status:
Modify the listen addresses in the api-server, kube-controller-manager, kube-scheduler, kubelet, and kube-router configurations, adding the following options so that their metrics endpoints can be reached.
Options added to the api-server:
--insecure-port=8080
--insecure-bind-address=0.0.0.0
kube-controller-manager:
--bind-address=0.0.0.0
--secure-port=10257
kube-scheduler:
kubelet: add the following options so that the Prometheus targets are allowed to scrape its endpoints.
--address=0.0.0.0
--authentication-token-webhook=true
--authorization-mode=Webhook
Configure metrics for kube-router:
Reference
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "10258"
Add the startup options:
--master=10.18.19.98:8080
--metrics-port=10258
Check the listen addresses of the k8s core components:
netstat -tlunp | grep kube
tcp 0 0 0.0.0.0:50051 0.0.0.0:* LISTEN 86466/kube-router
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 108134/kubelet
tcp 0 0 0.0.0.0:10250 0.0.0.0:* LISTEN 108134/kubelet
tcp 0 0 0.0.0.0:6443 0.0.0.0:* LISTEN 175680/kube-apiserv
tcp 0 0 0.0.0.0:10252 0.0.0.0:* LISTEN 168570/kube-control
tcp 0 0 0.0.0.0:10255 0.0.0.0:* LISTEN 108134/kubelet
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 175680/kube-apiserv
tcp 0 0 0.0.0.0:10258 0.0.0.0:* LISTEN 86466/kube-router
tcp 0 0 0.0.0.0:179 0.0.0.0:* LISTEN 86466/kube-router
tcp 0 0 0.0.0.0:20244 0.0.0.0:* LISTEN 86466/kube-router
tcp 0 0 0.0.0.0:4194 0.0.0.0:* LISTEN 108134/kubelet
Access each component's metrics endpoint:
curl http://10.18.19.98:8080/metrics   # kube-apiserver
curl http://10.18.19.98:10251/metrics  # kube-scheduler
curl http://10.18.19.98:10252/metrics  # kube-controller-manager
curl http://10.18.19.98:10255/metrics  # kubelet
curl http://10.18.19.98:10258/metrics  # kube-router
Create the ServiceMonitor objects for the k8s components:
tree exporter-kube-apiserver/ exporter-kube-controller-manager/ exporter-kube-scheduler/ exporter-kube-router/ exporter-kubelets/
exporter-kube-apiserver/
└── prometheus-k8s-service-monitor-apiserver.yaml
exporter-kube-controller-manager/
├── kube-controller-manager-svc.yaml
└── prometheus-k8s-service-monitor-kube-controller-manager.yaml
exporter-kube-scheduler/
├── kube-scheduler-svc.yaml
└── prometheus-k8s-service-monitor-kube-scheduler.yaml
exporter-kube-router/
├── kube-router-svc.yaml
└── prometheus-k8s-service-monitor-kube-scheduler.yaml
exporter-kubelets/
└── prometheus-k8s-service-monitor-kubelet.yaml
kubectl apply -f exporter-kube-apiserver/
Note: the api-server lives in the default namespace and uses the default kubernetes Service; the other components live in the kube-system namespace and need their own Services created.
Check the component status on the Prometheus targets page:
View the api-server and other ServiceMonitor resources:
kubectl get servicemonitors -n monitoring
NAME AGE
kube-apiserver 18m
kube-controller-manager 18m
kube-router 18m
kube-scheduler 18m
kube-state-metrics 1d
kubelet 18m
node-exporter 7h
prometheus 1d
prometheus-operator 1d
Fundamentally, everything Prometheus stores is a time series: a stream of timestamped values belonging to a single metric and a set of label dimensions. Besides the stored time series, Prometheus can also produce temporary derived time series as the result of query expressions.
Prometheus queries:
Prometheus provides a functional expression language that lets users select and aggregate time series data in real time. The result of an expression can be shown as a graph, viewed as tabular data in Prometheus' expression browser, or consumed by external systems through the HTTP API.
Expression language data types:
In Prometheus' expression language, every expression or sub-expression evaluates to one of four types: instant vector, range vector, scalar, or string.
Examples:
Query the health of all apiservers in the K8S cluster:
(sum(up{job="apiserver"} == 1) / count(up{job="apiserver"})) * 100
Query a Pod's CPU usage aggregated over the last minute:
sum by (container_name)(rate(container_cpu_usage_seconds_total{image!="",container_name!="POD",pod_name="acw62egvxd95l7t3q5uxee"}[1m]))
Time series selectors:
An instant vector selector selects a set of time series, each with a single sample at a given timestamp. The following example selects all time series with the metric name container_cpu_usage_seconds_total:
container_cpu_usage_seconds_total
These time series can be filtered further by appending a set of label matchers in curly braces {}. The following example selects only the series with the container_cpu_usage_seconds_total metric name, a non-empty image label, a container_name other than POD, and the given pod_name label:
container_cpu_usage_seconds_total{image!="",container_name!="POD",pod_name="acw62egvxd95l7t3q5uxee"}
You can also negate a label match or match label values against regular expressions. The matching operators are:
=: select labels exactly equal to the given string
!=: select labels not equal to the given string
=~: select labels that regex-match the given pattern
!~: select labels that do not regex-match the given pattern
For example, select the http_requests_total series for the staging, testing, and development environments and for HTTP methods other than GET:
http_requests_total{environment=~"staging|testing|development",method!="GET"}
A range vector selector works just like an instant vector selector, except that it selects a range of samples back from the current instant. Syntactically, a time range is appended in square brackets [] after a vector selector. The duration is a number followed by one of the following units:
s - seconds
m - minutes
h - hours
d - days
w - weeks
y - years
In the following example, we select all values recorded within the last 1 minute for the time series with metric name container_cpu_usage_seconds_total and the given pod_name label:
irate(container_cpu_usage_seconds_total{image!="",container_name!="POD",pod_name="acw62egvxd95l7t3q5uxee"}[1m])
The offset modifier changes the time offset of individual instant and range vectors in a query. For example, the following expression returns the value of container_cpu_usage_seconds_total as of 5 minutes before the current query time:
container_cpu_usage_seconds_total offset 5m
The same works for range vectors. The following returns the 5-minute rate of container_cpu_usage_seconds_total as it was one day ago:
(rate(container_cpu_usage_seconds_total{pod_name="acw62egvxd95l7t3q5uxee"} [5m] offset 1d))
Prometheus alerting consists of two independent parts. Alerting rules in the Prometheus server send alerts to an Alertmanager, and the Alertmanager then manages those alerts, including silencing, inhibition, aggregation, and sending notifications via methods such as email, PagerDuty, and HipChat.
The main steps to set up alerting and notifications are: set up and configure the Alertmanager, configure Prometheus to talk to the Alertmanager, and create alerting rules in Prometheus.
Use the -alertmanager.url flag to point Prometheus at the Alertmanager address so the two can connect.
Create and configure the Alertmanager:
kubectl apply -f alertmanager/
File descriptions:
# tree alertmanager/
alertmanager/
├── alertmanager.conf # configuration file
├── alertmanager.conf.base64 # configuration converted to base64
├── alertmanager-config.sh # base64 conversion script
├── alertmanager-config.yaml # loads the alertmanager configuration as a Secret
├── alertmanager-service.yaml # creates the alertmanager Service
├── alertmanager-templates-default.conf # email notification template
├── alertmanager-templates-slack.conf
├── alertmanager.yaml # creates the Alertmanager resource in K8S
├── default.base64
└── prometheus-k8s-service-monitor-alertmanager.yaml
Open the Alertmanager web console:
http://10.18.19.98:30903/#/status
Prometheus' configuration file is prometheus.yml, and Alertmanager's is alertmanager.yml.
Alerting: Prometheus sends detected anomalies to Alertmanager; this by itself does not send an email notification.
Notification: Alertmanager sends out the notification for the anomaly (email, webhook, and so on).
1. Alerting rules
Specify how often alerting rules are evaluated in prometheus.yml:
# How frequently to evaluate rules.
[ evaluation_interval: | default = 1m ]
Specify the rule files in prometheus.yml (wildcards are allowed, e.g. rules/*.rules):
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
- "/etc/prometheus/alert.rules"
Rules are based on the following (legacy-syntax) template:
ALERT <alert name>
  IF <expression>
  [ FOR <duration> ]
  [ LABELS <label set> ]
Where:
The alert name identifies the alert; it does not need to be unique.
The expression is the condition evaluated to decide whether the alert should fire; it usually uses existing metrics, as returned by the /metrics endpoints.
The duration is how long the condition must hold before the alert fires; for example, 5s means 5 seconds.
The label set is a group of labels that can be used in the notification templates.
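Putting the template together, a minimal rule in this legacy syntax might look like the following; the metric, threshold, and labels are illustrative:
ALERT InstanceDown
  IF up == 0
  FOR 5m
  LABELS { severity = "critical" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }} is down",
    description = "{{ $labels.instance }} has been down for more than 5 minutes.",
  }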
A ruleSelector is defined in prometheus-k8s-statefulset.yaml to mark which alert-rule ConfigMaps to load; it is matched by the labels on the prometheus-k8s-rules.yaml ConfigMap.
ruleSelector:
matchLabels:
role: prometheus-rulefiles
prometheus: k8s
prometheus-k8s-rules.yaml provides the prometheus-rulefiles as a ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-k8s-rules
namespace: monitoring
labels:
role: prometheus-rulefiles
prometheus: k8s
data:
pod.rules.yaml: |+
groups:
- name: noah_pod.rules
rules:
- alert: Pod_all_cpu_usage
expr: (sum by(name)(rate(container_cpu_usage_seconds_total{image!=""}[5m]))*100) > 10
for: 5m
labels:
severity: critical
service: pods
annotations:
description: Container {{ $labels.name }} CPU usage is above 75% (current value is {{ $value }})
summary: Dev CPU load alert
- alert: Pod_all_memory_usage
expr: sort_desc(avg by(name)(irate(container_memory_usage_bytes{name!=""}[5m]))*100) > 1024*10^3*2
for: 10m
labels:
severity: critical
annotations:
description: Container {{ $labels.name }} memory usage is above 2G (current value is {{ $value }})
summary: Dev memory load alert
- alert: Pod_all_network_receive_usage
expr: sum by (name)(irate(container_network_receive_bytes_total{container_name="POD"}[1m])) > 1024*1024*50
for: 10m
labels:
severity: critical
annotations:
description: Container {{ $labels.name }} network receive rate is above 50M (current value is {{ $value }})
summary: network_receive load alert
Once the configuration is in place, prometheus-operator reloads it automatically.
If you modify the ConfigMap again, just apply it:
kubectl apply -f prometheus-k8s-rules.yaml
Compare the email notification with the rules (alertmanager.yml must also be configured before emails are delivered):
2. Notification rules
Configure the route and receivers in alertmanager.yml:
global:
# ResolveTimeout is the time after which an alert is declared resolved
# if it has not been updated.
resolve_timeout: 5m
# The smarthost and SMTP sender used for mail notifications.
smtp_smarthost: 'xxxxx'
smtp_from: 'xxxxxxx'
smtp_auth_username: 'xxxxx'
smtp_auth_password: 'xxxxxx'
# The API URL to use for Slack notifications.
slack_api_url: 'https://hooks.slack.com/services/some/api/token'
# # The directory from which notification templates are read.
templates:
- '*.tmpl'
# The root route on which each incoming alert enters.
route:
# The labels by which incoming alerts are grouped together. For example,
# multiple alerts coming in for cluster=A and alertname=LatencyHigh would
# be batched into a single group.
group_by: ['alertname', 'cluster', 'service']
# When a new group of alerts is created by an incoming alert, wait at
# least 'group_wait' to send the initial notification.
# This way ensures that you get multiple alerts for the same group that start
# firing shortly after another are batched together on the first
# notification.
group_wait: 30s
# When the first notification was sent, wait 'group_interval' to send a batch
# of new alerts that started firing for that group.
group_interval: 5m
# If an alert has successfully been sent, wait 'repeat_interval' to
# resend them.
#repeat_interval: 1m
repeat_interval: 15m
# A default receiver
# If an alert isn't caught by a route, send it to default.
receiver: default
# All the above attributes are inherited by all child routes and can
# overwritten on each.
# The child route trees.
routes:
- match:
severity: critical
receiver: email_alert
receivers:
- name: 'default'
email_configs:
- to : '[email protected]'
send_resolved: true
- name: 'email_alert'
email_configs:
- to : '[email protected]'
send_resolved: true
Terminology:
route
The route block sets the dispatch policy for alerts. It is a tree structure matched depth-first, from left to right:
// Match does a depth-first left-to-right search through the route tree
// and returns the matching routing nodes.
func (r *Route) Match(lset model.LabelSet) []*Route {
Alert
是alertmanager接收到的报警,类型如下:
// Alert is a generic representation of an alert in the Prometheus eco-system.
type Alert struct {
// Label value pairs for purpose of aggregation, matching, and disposition
// dispatching. This must minimally include an "alertname" label.
Labels LabelSet `json:"labels"`
// Extra key/value information which does not define alert identity.
Annotations LabelSet `json:"annotations"`
// The known time range for this alert. Both ends are optional.
StartsAt time.Time `json:"startsAt,omitempty"`
EndsAt time.Time `json:"endsAt,omitempty"`
GeneratorURL string `json:"generatorURL"`
}
具有相同Lables的Alert(key和value都相同)才会被认为是同一种。在prometheus rules文件配置的一条规则可能会产生多种报警。
Group
alertmanager会根据group_by配置将Alert分组。如下规则,当go_goroutines等于4时会收到三条报警,alertmanager会将这三条报警分成两组向receivers发出通知:
ALERT test1
IF go_goroutines > 1
LABELS {label1="l1", label2="l2", status="test"}
ALERT test2
IF go_goroutines > 2
LABELS {label1="l2", label2="l2", status="test"}
ALERT test3
IF go_goroutines > 3
LABELS {label1="l2", label2="l1", status="test"}
主要处理流程:
接收到Alert,根据labels判断属于哪些Route(可存在多个Route,一个Route有多个Group,一个Group有多个Alert)
将Alert分配到Group中,没有则新建Group
新的Group等待group_wait指定的时间(等待时可能收到同一Group的Alert),根据resolve_timeout判断Alert是否解决,然后发送通知
已有的Group等待group_interval指定的时间,判断Alert是否解决,当上次发送通知到现在的间隔大于repeat_interval或者Group有更新时会发送通知
3)Alertmanager
Alertmanager是警报的缓冲区,它具有以下特征:
可以通过特定端点(不是特定于Prometheus)接收警报。
可以将警报重定向到接收者,如hipchat、邮件或其他人。
足够智能,可以确定已经发送了类似的通知。所以,如果出现问题,你不会被成千上万的电子邮件淹没。
Alertmanager 客户端(在这种情况下是 Prometheus)首先发送 POST 消息,将所有要处理的警报发送到 /api/v1/alerts。例如:
[
{
"labels": {
"alertname": "low_connected_users",
"severity": "warning"
},
"annotations": {
"description": "Instance play-app:9000 under lower load",
"summary": "play-app:9000 of job playframework-app is under lower load"
}
}]
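也可以直接用 curl 手动向该端点推送一条测试警报,验证 Alertmanager 的分组与通知配置(下面的服务地址与端口为示例,按实际环境替换):
curl -XPOST http://alertmanager.monitoring.svc:9093/api/v1/alerts -H "Content-Type: application/json" -d '[
  {
    "labels": {"alertname": "low_connected_users", "severity": "warning"},
    "annotations": {"summary": "manual test alert"}
  }
]'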
alert工作流程:
一旦这些警报存储在Alertmanager,它们可能处于以下任何状态:
Inactive:这里什么都没有发生。
Pending:客户端告诉我们这个警报必须被触发。然而,警报可以被分组、压抑/抑制或者静默/静音。一旦所有的验证都通过了,我们就转到Firing。
Firing:警报发送到Notification Pipeline,它将联系警报的所有接收者。然后客户端告诉我们警报解除,于是转换到 Inactive 状态。
Prometheus有一个专门的端点,允许我们列出所有的警报,并遵循状态转换。Prometheus所示的每个状态以及导致过渡的条件如下所示:
规则不符合:
警报没有激活。
规则符合:
警报现在处于活动状态。 执行一些验证是为了避免淹没接收器的消息。
警报发送到接收者:
4)Inhibition
抑制是指当某个警报发出后,停止重复发送由该警报引发的其他警报的机制。
例如,当警报被触发,通知整个集群不可达,可以配置Alertmanager忽略由该警报触发而产生的所有其他警报,这可以防止通知数百或数千与此问题不相关的其他警报。
抑制机制可以通过Alertmanager的配置文件来配置。
Inhibition允许在其他警报处于触发状态时,抑制一些警报的通知。例如,如果同一警报(基于警报名称)已经非常紧急,那么我们可以配置一个抑制来使任何警告级别的通知静音。 alertmanager.yml文件的相关部分如下所示:
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['low_connected_users']
抑制规则的含义是:当存在与 source 匹配器相匹配的警报时,静音与 target 匹配器相匹配的其他警报;并且这两类警报在 equal 列出的标签上必须取值相同。
# Matchers that have to be fulfilled in the alerts to be muted.
target_match:
  [ <labelname>: <labelvalue>, ... ]
target_match_re:
  [ <labelname>: <regex>, ... ]
# Matchers for which one or more alerts have to exist for the
# inhibition to take effect.
source_match:
  [ <labelname>: <labelvalue>, ... ]
source_match_re:
  [ <labelname>: <regex>, ... ]
# Labels that must have an equal value in the source and target
# alert for the inhibition to take effect.
[ equal: '[' <labelname>, ... ']' ]
5)Silences
Silences是快速地使警报暂时静音的一种方法。 我们直接通过Alertmanager管理控制台中的专用页面来配置它们。在尝试解决严重的生产问题时,这对避免收到垃圾邮件很有用。
参考:
alertmanager 参考资料
抑制规则 inhibit_rule参考资料
1. 应用架构案例
2. 环境需求
项目 | 组件/版本 |
---|---|
version | v1.13.5 |
master管理方式 | kubespray |
docker version | v18.03 |
docker Storage Driver | overlay |
network plugins | flannel(vxlan) |
服务网关 | ingress-nginx |
CI/CD | jenkins |
监控 | prometheus-operator |
日志 | efk |
rabbitmq | v3.7 |
redis-cluster | v5.0.2 |
1)version
2)master 管理方式
个人偏向使用kubespray 管理集群服务,更加灵活可塑。
rancher 适合小规模场景使用。
3)HA mode
4)docker version
5)docker Storage Driver
6)network plugins
7)服务与发现
第一种场景 user ---> service
第二种场景 service--> service
nginx-ingress 与 kong ingress 对比
kong ingress 类似一个 api-gateway
推荐kong ingress。
8)ci/cd
第一阶段:
第二阶段:
9)监控
Prometheus -operator
3. 项目实施步骤
HPA 自定义监控指标弹性伸缩
由于启用了 TLS 双向认证、RBAC 授权等严格的安全机制,建议从头开始部署,而不要从中间开始,否则可能会认证、授权等失败!
部署过程中需要有很多证书的操作,请大家耐心操作。
该部署操作仅是搭建成了一个可用 kubernetes 集群,而很多地方还需要进行优化,heapster 插件、EFK 插件不一定会用于真实的生产环境中,但是通过部署这些插件,可以让大家了解到如何部署应用到集群上。
这一步是在安装配置kubernetes的所有步骤中最容易出错也最难于排查问题的一步,而这却刚好是第一步,万事开头难,不要因为这点困难就望而却步。
kubernetes 系统的各组件需要使用 TLS 证书对通信进行加密,本文档使用 CloudFlare 的 PKI 工具集 cfssl 来生成 Certificate Authority (CA) 和其它证书。
生成的 CA 证书和秘钥文件如下:
使用证书的组件如下:
kube-controller-manager、kube-scheduler 当前需要和 kube-apiserver 部署在同一台机器上且使用非安全端口通信,故不需要证书。
注意:以下操作都在 master 节点即 172.16.200.100 这台主机上执行,证书只需要创建一次即可,以后在向集群中添加新节点时只要将 /etc/kubernetes/ 目录下的证书拷贝到新节点上即可。
1)安装 CFSSL
直接使用二进制源码包安装:
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
chmod +x cfssl_linux-amd64
mv cfssl_linux-amd64 /root/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssljson_linux-amd64
mv cfssljson_linux-amd64 /root/local/bin/cfssljson
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl-certinfo_linux-amd64
mv cfssl-certinfo_linux-amd64 /root/local/bin/cfssl-certinfo
export PATH=/root/local/bin:$PATH
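安装完成后可以简单验证一下几个命令是否可用(版本号以实际下载的发行版为准):
cfssl version
which cfssljson cfssl-certinfo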
若改用 go get 方式安装,则会在 $GOPATH/bin 目录下得到以 cfssl 开头的几个命令。
注意:以下文章中出现的cat的文件名如果不存在需要手工创建。
2)创建 CA (Certificate Authority)
创建 CA 配置文件:
mkdir /root/ssl
cd /root/ssl
cfssl print-defaults config > config.json
cfssl print-defaults csr > csr.json
# 根据config.json文件的格式创建如下的ca-config.json文件
# 过期时间设置成了 87600h
cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
EOF
字段说明
3)创建 CA 证书签名请求
创建 ca-csr.json 文件,内容如下:
cat > ca-csr.json << EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
生成 CA 证书和私钥:
$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca
$ ls ca*
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
4)创建 kubernetes 证书
创建 kubernetes 证书签名请求文件 kubernetes-csr.json:
cat > kubernetes-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"172.16.200.100",
"172.16.200.101",
"172.16.200.102",
"10.254.0.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
生成 kubernetes 证书和私钥:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
$ ls kubernetes*
kubernetes.csr kubernetes-csr.json kubernetes-key.pem kubernetes.pem
5)创建 admin 证书
创建 admin 证书签名请求文件 admin-csr.json:
cat > admin-csr.json << EOF
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
6)生成 admin 证书和私钥
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
$ ls admin*
admin.csr admin-csr.json admin-key.pem admin.pem
7)创建 kube-proxy 证书
创建 kube-proxy 证书签名请求文件 kube-proxy-csr.json
cat > kube-proxy-csr.json << EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
生成 kube-proxy客户端证书和私钥:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
$ ls kube-proxy*
kube-proxy.csr kube-proxy-csr.json kube-proxy-key.pem kube-proxy.pem
8)校验证书
以 kubernetes 证书为例,使用 openssl 命令:
$ openssl x509 -noout -text -in kubernetes.pem
...
Signature Algorithm: sha256WithRSAEncryption
Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
Validity
Not Before: Apr 5 05:36:00 2017 GMT
Not After : Apr 5 05:36:00 2018 GMT
Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
...
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Key Identifier:
DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
X509v3 Authority Key Identifier:
keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD
X509v3 Subject Alternative Name:
DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:172.20.0.112, IP Address:172.20.0.113, IP Address:172.20.0.114, IP Address:172.20.0.115, IP Address:10.254.0.1
使用cfssl-certinfo命令:
$ cfssl-certinfo -cert kubernetes.pem
{
"subject": {
"common_name": "kubernetes",
"country": "CN",
"organization": "k8s",
"organizational_unit": "System",
"locality": "BeiJing",
"province": "BeiJing",
"names": [
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"kubernetes"
]
},
"issuer": {
"common_name": "Kubernetes",
"country": "CN",
"organization": "k8s",
"organizational_unit": "System",
"locality": "BeiJing",
"province": "BeiJing",
"names": [
"CN",
"BeiJing",
"BeiJing",
"k8s",
"System",
"Kubernetes"
]
},
"serial_number": "174360492872423263473151971632292895707129022309",
"sans": [
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"127.0.0.1",
"10.64.3.7",
"10.254.0.1"
],
"not_before": "2017-04-05T05:36:00Z",
"not_after": "2018-04-05T05:36:00Z",
"sigalg": "SHA256WithRSA",
9)分发证书
将生成的证书和秘钥文件(后缀名为.pem)拷贝到所有机器的 /etc/kubernetes/ssl 目录下备用:
mkdir -p /etc/kubernetes/ssl
cp *.pem /etc/kubernetes/ssl
k8s 集群使用kubectl 管理,一般在master 配置kubeconfig进行管理集群。
安装 kubectl:kubernetes-server-linux-amd64.tar.gz 安装包中已包含了客户端管理工具,无需重新安装。
创建kubectl kubeconfig文件:
export KUBE_APISERVER="https://172.16.200.100:6443"
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER}
# 设置客户端认证参数
kubectl config set-credentials admin \
--client-certificate=/etc/kubernetes/ssl/admin.pem \
--embed-certs=true \
--client-key=/etc/kubernetes/ssl/admin-key.pem
# 设置上下文参数
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=admin
# 设置默认上下文
kubectl config use-context kubernetes
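配置完成后可以查看生成的 kubeconfig,并验证与 apiserver 的连通性(后者需要 kube-apiserver 已启动,可在后文部署完 master 组件后再执行):
kubectl config view
kubectl cluster-info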
kubelet、kube-proxy 等 Node 机器上的进程与Master 机器的 kube-apiserver 进程通信时需要认证和授权的。kubernetes 1.4 开始支持由 kube-apiserver 为客户端生成 TLS 证书的 TLS Bootstrapping 功能,这样就不需要为每个客户端生成证书了;该功能当前仅支持为 kubelet 生成证书。
因为我的master节点和node节点复用,所以在这一步其实已经安装了kubectl。参考安装kubectl命令行工具。
以下操作只需要在master节点上执行,生成的*.kubeconfig文件可以直接拷贝到node节点的/etc/kubernetes目录下:
1)创建 TLS Bootstrapping Token
Token auth file:
Token可以是任意的包含128 bit的字符串,可以使用安全的随机数发生器生成。
export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv << EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
后三行是一句,直接复制上面的脚本运行即可。
注意:在进行后续操作前请检查 token.csv 文件,确认其中的 ${BOOTSTRAP_TOKEN} 环境变量已经被真实的值替换。
BOOTSTRAP_TOKEN 将被写入到 kube-apiserver 使用的 token.csv 文件和 kubelet 使用的 bootstrap.kubeconfig 文件,如果后续重新生成了 BOOTSTRAP_TOKEN,则需要:
更新 token.csv 文件,分发到所有机器 (master 和 node)的 /etc/kubernetes/ 目录下,分发到node节点上非必需;
重新生成 bootstrap.kubeconfig 文件,分发到所有 node 机器的 /etc/kubernetes/ 目录下;
重启 kube-apiserver 和 kubelet 进程;
重新 approve kubelet 的 csr 请求;
cp token.csv /etc/kubernetes/
2)创建kubelet bootstrapping kubeconfig文件
cd /etc/kubernetes
export KUBE_APISERVER="https://172.16.200.100:6443"
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=bootstrap.kubeconfig
# 设置客户端认证参数
kubectl config set-credentials kubelet-bootstrap \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=bootstrap.kubeconfig
# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user=kubelet-bootstrap \
--kubeconfig=bootstrap.kubeconfig
# 设置默认上下文
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
3)创建 kube-proxy kubeconfig 文件
export KUBE_APISERVER="https://172.20.0.100:6443"
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.kubeconfig
# 设置客户端认证参数
kubectl config set-credentials kube-proxy \
--client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
--client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
# 设置默认上下文
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
4)分发 kubeconfig 文件
将两个kubeconfig文件分发到所有 Node 机器的 /etc/kubernetes/ 目录:
cp bootstrap.kubeconfig kube-proxy.kubeconfig /etc/kubernetes/
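如果是在 master 上生成后分发到多个 node,可以用类似下面的方式批量拷贝(节点 IP 为示例,按实际环境替换):
for node in 172.16.200.101 172.16.200.102; do
  scp bootstrap.kubeconfig kube-proxy.kubeconfig root@${node}:/etc/kubernetes/
done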
etcd 是一个分布式一致性k-v存储系统,可用于服务注册发现与共享配置,具有以下优点:
etcd release 下载地址 页面下载最新版本的二进制文件。
如需从源码编译 etcd,则依赖 go,需要先到官方站点下载最新的 go:
tar -xf go1.8.3.linux-amd64.tar.gz -C /usr/local/
cat > /etc/profile.d/go.sh << 'EOF'
export GOROOT=/usr/local/go
export PATH=$PATH:$GOROOT/bin
EOF
1)安装 etcd
tar xf etcd-v3.2.7-linux-amd64.tar.gz -C /usr/local/
ln -sv /usr/local/etcd-v3.2.7-linux-amd64/ /usr/local/etcd
cat > /etc/profile.d/etcd.sh << 'EOF'
export PATH=$PATH:/usr/local/etcd
EOF
etcd-v3.2.1 有bug 解决方法参考链接 link
修改/etc/etcd/etcd.conf 注意修改每个node ip:
[member]
ETCD_NAME=cd-test-001.drcloud.com
ETCD_DATA_DIR="/data/lib/etcd/"
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://172.16.200.100:2379"
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.16.200.100:2380"
[cluster]
ETCD_INITIAL_CLUSTER="cd-test-001.drcloud.com=http://172.16.200.100:2380,cd-test-002.drcloud.com=http://172.16.200.101:2380,cd-test-003.drcloud.com=http://172.16.200.102:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
参数说明:
name 节点名称
data-dir 指定节点的数据存储目录
listen-peer-urls 监听URL,用于与其他节点通讯
listen-client-urls 对外提供服务的地址:比如 http://ip:2379,http://127.0.0.1:2379 ,客户端会连接到这里和 etcd 交互
initial-advertise-peer-urls 该节点member(同伴)监听地址,这个值会告诉集群中其他节点
initial-cluster 集群中所有节点的信息,格式为 node1=http://ip1:2380,node2=http://ip2:2380,… 。注意:这里的 node1 是节点的 --name 指定的名字;后面的 ip1:2380 是 --initial-advertise-peer-urls 指定的值
initial-cluster-state 新建集群的时候,这个值为 new ;假如已经存在的集群,这个值为 existing
initial-cluster-token 创建集群的 token,这个值每个集群保持唯一。这样的话,如果你要重新创建集群,即使配置和之前一样,也会再次生成新的集群和节点 uuid;否则会导致多个集群之间的冲突,造成未知的错误
advertise-client-urls 对外公告的该节点客户端监听地址,这个值会告诉集群中其他节点
systemd启动配置文件(默认rpm 安装的):
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/data/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/etcd/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\" --initial-advertise-peer-urls=\"${ETCD_INITIAL_ADVERTISE_PEER_URLS}\" --listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" --initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" --advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" --initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" --initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\""
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
配置防火墙规则:
firewall-cmd --zone=public --add-port=2379/tcp --permanent
firewall-cmd --zone=public --add-port=2380/tcp --permanent
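各节点配置完成后启动 etcd 并确认集群成员,示意如下:
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd
# 查看集群成员
etcdctl member list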
2)etcd集群运行过程中的改配
主要用于故障节点替换,集群扩容需求。
集群运行过程中的改配不区分static和discovery方式。
1、替换的步骤:
比如集群中某一member重启后仍不能恢复时,就需要替换一个新member进来。
a、从集群中删除老member
etcdctl member remove 1609b5a3a078c227
b、向集群中新增新member
etcdctl member add <name> <peerURL>
例子:
etcdctl member add etcd2 http://192.168.2.56:2380
输入命令后返回如下信息:
Added member named etcd2 with ID b7d510356ee2e68b to cluster
ETCD_NAME="etcd2"
ETCD_INITIAL_CLUSTER="etcd0=http://192.168.2.55:2380,etcd2=http://192.168.2.56:2380,etcd1=http://192.168.2.54:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
c、删除老节点data目录
如果不删除,启动后节点仍旧沿用之前的老id, 其他正常节点不认,不能建立联系
d、在新member上启动etcd进程
2、扩容的步骤:
执行上面第b,d两步即可。
etcd 集群健康检查:
curl http://172.16.200.206:2379/health
etcdctl cluster-health
member 37460b828c8625a0 is healthy: got healthy result from http://0.0.0.0:2379
member 52b98a730bb6d77c is healthy: got healthy result from http://172.16.200.208:2379
member 7f453c22b9758161 is healthy: got healthy result from http://172.16.200.206:2379
kubernetes master 节点包含的组件:
目前这三个组件需要部署在同一台机器上。
1)验证TLS 证书文件及token.csv文件
pem和token.csv证书文件我们在TLS证书和秘钥这一步中已经创建过了。
我们再检查一下:
ls /etc/kubernetes/ssl
admin-key.pem admin.pem ca-key.pem ca.pem kube-proxy-key.pem kube-proxy.pem kubernetes-key.pem kubernetes.pem
ls /etc/kubernetes/token.csv
2)下载最新版本的二进制文件
从 github release 页面下载发布版 tarball 并解压:
wget https://dl.k8s.io/v1.7.6/kubernetes-server-linux-amd64.tar.gz
tar xf /root/k8s/kubernetes-server-linux-amd64.tar.gz -C /usr/local/
cat > /etc/profile.d/kube-apiserver.sh << 'EOF'
export PATH=/usr/local/kubernetes/server/bin:$PATH
EOF
3)配置和启动 kube-apiserver
serivce配置文件/usr/lib/systemd/system/kube-apiserver.service内容:
cat /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service
[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/local/kubernetes/server/bin/kube-apiserver \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_ETCD_SERVERS \
$KUBE_API_ADDRESS \
$KUBE_API_PORT \
$KUBELET_PORT \
$KUBE_ALLOW_PRIV \
$KUBE_SERVICE_ADDRESSES \
$KUBE_ADMISSION_CONTROL \
$KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
/etc/kubernetes/config文件的内容为:
cat config
###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
# kube-apiserver.service
# kube-controller-manager.service
# kube-scheduler.service
# kubelet.service
# kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"
# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"
# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"
# How the controller-manager, scheduler, and proxy find the apiserver
#KUBE_MASTER="--master=http://sz-pg-oam-docker-test-001.tendcloud.com:8080"
KUBE_MASTER="--master=http://172.16.200.216:8080"
注意:该配置文件同时被kube-apiserver、kube-controller-manager、kube-scheduler、kubelet、kube-proxy使用。
apiserver配置文件/etc/kubernetes/apiserver内容为:
cat apiserver
###
## kubernetes system config
##
## The following values are used to configure the kube-apiserver
##
#
## The address on the local server to listen to.
#KUBE_API_ADDRESS="--insecure-bind-address=sz-pg-oam-docker-test-001.tendcloud.com"
KUBE_API_ADDRESS="--advertise-address=172.16.200.216 --bind-address=172.16.200.216 --insecure-bind-address=172.16.200.216"
#
## The port on the local server to listen on.
#KUBE_API_PORT="--port=8080"
#
## Port minions listen on
#KUBELET_PORT="--kubelet-port=10250"
#
## Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=http://172.16.200.100:2379,http://172.16.200.101:2379,http://172.16.200.102:2379"
#
## Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.254.0.0/16"
#
## default admission control policies
KUBE_ADMISSION_CONTROL="--admission-control=ServiceAccount,NamespaceLifecycle,NamespaceExists,LimitRanger,ResourceQuota"
#
## Add your own!
KUBE_API_ARGS="--authorization-mode=RBAC --runtime-config=rbac.authorization.k8s.io/v1beta1 --kubelet-https=true --experimental-bootstrap-token-auth --token-auth-file=/etc/kubernetes/token.csv --service-node-port-range=30000-32767 --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem --client-ca-file=/etc/kubernetes/ssl/ca.pem --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem --etcd-cafile=/etc/kubernetes/ssl/ca.pem --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem --enable-swagger-ui=true --apiserver-count=3 --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100 --audit-log-path=/var/lib/audit.log --event-ttl=1h --log-dir=/data/logs/kubernetes/ --v=2 --logtostderr=false"
说明:
启动kube-apiserver:
systemctl daemon-reload
systemctl enable kube-apiserver
systemctl start kube-apiserver
systemctl status kube-apiserver
4)配置和启动 kube-controller-manager
创建 kube-controller-manager的serivce配置文件:
cat /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/local/kubernetes/server/bin/kube-controller-manager \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
配置文件/etc/kubernetes/controller-manager:
cat controller-manager
###
# The following values are used to configure the kubernetes controller-manager
# defaults from config and apiserver should be adequate
# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem --root-ca-file=/etc/kubernetes/ssl/ca.pem --leader-elect=true --log-dir=/data/logs/kubernetes/ --v=2 --logtostderr=false"
说明:
启动 kube-controller-manager:
systemctl daemon-reload
systemctl enable kube-controller-manager
systemctl start kube-controller-manager
5)配置和启动 kube-scheduler
创建 kube-scheduler的serivce配置文件:
cat /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/local/kubernetes/server/bin/kube-scheduler \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
kube-scheduler 配置文件 /etc/kubernetes/scheduler:
cat > /etc/kubernetes/scheduler << EOF
###
# kubernetes scheduler config
# default config should be adequate
# Add your own!
KUBE_SCHEDULER_ARGS="--leader-elect=true --address=127.0.0.1 --log-dir=/data/logs/kubernetes/ --v=2 --logtostderr=false"
EOF
启动 kube-scheduler:
systemctl daemon-reload
systemctl enable kube-scheduler
systemctl start kube-scheduler
6)验证 master 节点功能
kubectl get componentstatuses
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
etcd-2 Healthy {"health": "true"}
1)部署node节点
kubernetes node 节点包含如下组件:
注意:每台 node 上都需要安装 flannel,master 节点上可以不必安装。
2)检查目录和文件
我们再检查一下三个节点上,经过前几步操作生成的配置文件。
# ls /etc/kubernetes/ssl/
admin-key.pem admin.pem ca-key.pem ca.pem kube-proxy-key.pem kube-proxy.pem kubernetes-key.pem kubernetes.pem
# ls /etc/kubernetes/
apiserver bootstrap.kubeconfig config controller-manager kube-proxy.kubeconfig scheduler ssl token.csv
3)flannel 网络架构图
4)配置安装Flanneld
默认使用yum 安装,需要替换二进制 flanneld。
下载 flannel-0.8 binary:
yum install flanneld -y
wget https://github.com/coreos/flannel/releases/download/v0.8.0/flanneld-amd64
chmod +x flanneld-amd64
cp flanneld-amd64 /usr/bin/flanneld
service配置文件/usr/lib/systemd/system/flanneld.service
cat /usr/lib/systemd/system/flanneld.service
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service
[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld-start $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure
[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
/etc/sysconfig/flanneld配置文件:
cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="http://172.16.200.100:2379,http://172.16.200.101:2379,http://172.16.200.102:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
ETCD_PREFIX="/kube-centos/network"
FLANNEL_ETCD_KEY="/kube-centos/network"
ACCESS_KEY_ID=XXXXXXX
ACCESS_KEY_SECRET=XXXXXXX
# Any additional options that you want to pass
#FLANNEL_OPTIONS=" -iface=eth0 -log_dir=/data/logs/kubernetes --logtostderr=false --v=2"
#FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem -etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem"
5)在etcd中创建网络配置
执行下面的命令为 docker 分配 IP 地址段(Backend 二选一:阿里云 VPC 环境使用 ali-vpc,普通二层互通环境使用 host-gw):
etcdctl mkdir /kube-centos/network
# 阿里云 VPC 环境:
etcdctl mk /kube-centos/network/config '{"Network":"10.24.0.0/16","Backend":{"Type":"ali-vpc"}}'
# 普通环境:
etcdctl mk /kube-centos/network/config '{"Network":"10.24.0.0/16","Backend":{"Type":"host-gw"}}'
kubelet 依赖 kubernetes-cni,这里使用 rpm 强制安装(忽略依赖检查):
rpm -ivh --force kubernetes-cni-0.5.1-0.x86_64.rpm --nodeps
配置cni 插件:
mkdir -p /etc/cni/net.d
cat > /etc/cni/net.d/10-flannel.conf << EOF
{
"name": "cbr0",
"type": "flannel",
"delegate": {
"isDefaultGateway": true,
"forceAddress": true,
"bridge": "cni0",
"mtu": 1500
}
}
EOF
内核参数修改:
cat /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
sysctl -p /etc/sysctl.d/k8s.conf
查看获取的地址段:
#etcdctl ls /kube-centos/network/subnets
/kube-centos/network/subnets/10.24.15.0-24
/kube-centos/network/subnets/10.24.38.0-24
6)安装和配置 kubelet
kubelet 启动时向 kube-apiserver 发送 TLS bootstrapping 请求,需要先将 bootstrap token 文件中的 kubelet-bootstrap 用户赋予system:node-bootstrapper cluster 角色(role), 然后 kubelet 才能有权限创建认证请求(certificate signing requests):
cd /etc/kubernetes
kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap
创建 kubelet 的service配置文件:
# cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service
[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/kubernetes/server/bin/kubelet \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBELET_API_SERVER \
$KUBELET_ADDRESS \
$KUBELET_PORT \
$KUBELET_HOSTNAME \
$KUBE_ALLOW_PRIV \
$KUBELET_POD_INFRA_CONTAINER \
$KUBELET_ARGS
Restart=on-failure
[Install]
WantedBy=multi-user.target
kubelet的配置文件/etc/kubernetes/kubelet。其中的IP地址更改为你的每台node节点的IP地址。
注意:/var/lib/kubelet需要手动创建。
kubelet 配置文件:
cat > /etc/kubernetes/kubelet << EOF
###
### kubernetes kubelet (minion) config
##
### The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=172.16.200.100"
##
### The port for the info server to serve on
##KUBELET_PORT="--port=10250"
##
### You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=172.16.200.100"
##
### location of the api-server
#KUBELET_API_SERVER="--api-servers=http://172.16.200.100:8080"
##
### pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=gcr.io/google_containers/pause:latest"
##
### Add your own!
KUBELET_ARGS="--cgroup-driver=systemd --cluster-dns=10.254.0.2 --experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --require-kubeconfig --cert-dir=/etc/kubernetes/ssl --cluster-domain=cluster.local --hairpin-mode promiscuous-bridge --serialize-image-pulls=false --network-plugin=cni --cni-conf-dir=/etc/cni/net.d/ --cni-bin-dir=/opt/cni/bin/ --network-plugin-mtu=1500 --log-dir=/data/logs/kubernetes/ --v=2 --logtostderr=false"
EOF
kubelet 依赖启动配置文件 bootstrap.kubeconfig:
systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet
systemctl status kubelet
7)通过 kubelet 的 TLS 证书请求
kubelet 首次启动时向 kube-apiserver 发送证书签名请求,必须通过后 kubernetes 系统才会将该 Node 加入到集群。
查看未授权的CSR请求:
# kubectl get csr
NAME AGE REQUESTOR CONDITION
node-csr-8I8soRqLhxiH2nThkgUsL2oIaKyh15AuNOVgJddWBqA 2s kubelet-bootstrap Pending
node-csr-9byGSZPAX0eT60qME8_2PIZ0Q4GkDTFG-1tvPhVaH40 49d kubelet-bootstrap Approved,Issued
node-csr-DpvCEHT98ARavxjdLpa_yl_aNGddNTAX07MEVSAjnUM 4d kubelet-bootstrap Approved,Issued
node-csr-nAOtjarW3mJ3boQ3AtaeGCbQYbW_jo8AGscFnk1uxqw 8d kubelet-bootstrap Approved,Issued
node-csr-sgI8CYnTFQZqaZg9wdJP6OabqBiNA0DpZ5Z0wCCl4bQ 54d kubelet-bootstrap Approved,Issued
通过CSR请求:
kubectl certificate approve node-csr-8I8soRqLhxiH2nThkgUsL2oIaKyh15AuNOVgJddWBqA
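如果待审批的 CSR 较多,也可以在确认请求来源后批量 approve(示例写法):
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve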
通过node查看 :
kubectl get node
NAME STATUS AGE VERSION
172.16.200.206 Ready 11m v1.7.6
172.16.200.209 Ready 49d v1.7.6
172.16.200.216 Ready 4d v1.7.6
自动生成了 kubelet.kubeconfig 文件和公私钥:
ls -l /etc/kubernetes/kubelet.kubeconfig
注意:假如你更新kubernetes的证书,只要没有更新token.csv,当重启kubelet后,该node就会自动加入到kuberentes集群中,而不会重新发送certificaterequest,也不需要在master节点上执行kubectl certificate approve操作。
前提是不要删除node节点上的/etc/kubernetes/ssl/kubelet*和/etc/kubernetes/kubelet.kubeconfig文件,否则kubelet启动时会提示找不到证书而失败。
8)配置 kube-proxy
创建 kube-proxy 的service配置文件。
文件路径/usr/lib/systemd/system/kube-proxy.service:
cat > /usr/lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/proxy
ExecStart=/usr/local/kubernetes/server/bin/kube-proxy \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBE_MASTER \
$KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
kube-proxy配置文件/etc/kubernetes/proxy:
cat > /etc/kubernetes/proxy << EOF
###
# kubernetes proxy config
# default config should be adequate
# Add your own!
KUBE_PROXY_ARGS="--bind-address=172.16.200.100 --hostname-override=172.16.200.100 --kube-api-burst=50 --kube-api-qps=20 --master=http://172.16.200.100:8080 --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig --cluster-cidr=10.254.0.0/16 --log-dir=/data/logs/kubernetes/ --v=2 --logtostderr=false"
EOF
启动 kube-proxy:
systemctl daemon-reload
systemctl enable kube-proxy
systemctl start kube-proxy
systemctl status kube-proxy
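kube-proxy 启动后会在 iptables 中生成 KUBE 相关的 NAT 规则,可以据此做一个简单验证(示意):
iptables -t nat -nL | grep KUBE | head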
创建资源对象:
1)根据yaml 配置文件一次性创建service、rc
kubectl create -f my-service.yaml -f my-rc.yaml
2)查看资源对象
查看所有pod 列表:
kubectl get pod -n <namespace>
查看RC和service 列表
kubectl get rc,svc
3) 描述资源对象
显示Node的详细信息:
kubectl describe node <node_name>
显示Pod的详细信息:
kubectl describe pod <pod_name>
4)删除资源对象
基于pod.yaml 定义的名称删除pod:
kubectl delete -f pod.yaml
删除所有包含某个label的pod 和service:
kubectl delete pod,svc -l name=<label_name>
删除所有Pod:
kubectl delete pod --all
5)执行容器的命令
执行pod的date命令:
kubectl exec <pod_name> -- date
通过bash 获得pod中某个容器的TTY,相当于登陆容器:
kubectl exec -it <pod_name> -c <container_name> -- bash
6) 查看容器的日志
kubectl logs <pod_name>
1)创建redis-master-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
name: redis-master
labels:
name: redis-master
spec:
replicas: 1
selector:
name: redis-master
template:
metadata:
labels:
name: redis-master
spec:
containers:
- name: master
image: kubeguide/redis-master
ports:
- containerPort: 6379
发布到kubernetes集群,自动创建pod:
kubectl create -f redis-master-controller.yaml
kubectl get rc
kubectl get pods
2)创建redis-master-service.yaml
apiVersion: v1
kind: Service
metadata:
name: redis-master
labels:
name: redis-master
spec:
ports:
- port: 6379
targetPort: 6379
selector:
name: redis-master
创建service:
kubectl create -f redis-master-service.yaml
kubectl get services
3)创建redis-slave-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
name: redis-slave
spec:
replicas: 2
selector:
name: redis-slave
template:
metadata:
name: redis-slave
labels:
name: redis-slave
spec:
containers:
- name: redis-slave
image: kubeguide/guestbook-redis-slave
env:
- name: GET_HOSTS_FROM
value: env
ports:
- containerPort: 6379
创建redis-slave:
kubectl create -f redis-slave-controller.yaml
kubectl get rc
kubectl get pods
4)创建redis-slave-service.yaml
apiVersion: v1
kind: Service
metadata:
name: redis-slave
labels:
name: redis-slave
spec:
ports:
- port: 6379
selector:
name: redis-slave
创建 redis-slave service:
kubectl create -f redis-slave-service.yaml
kubectl get services
5)创建frontend-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
name: frontend
labels:
name: frontend
spec:
replicas: 3
selector:
name: frontend
template:
metadata:
labels:
name: frontend
spec:
containers:
- name: frontend
image: kubeguide/guestbook-php-frontend
env:
- name: GET_HOSTS_FROM
value: env
ports:
- containerPort: 80
创建:
kubectl create -f frontend-controller.yaml
kubectl get rc
kubectl get pods
6)创建frontend-service.yaml
apiVersion: v1
kind: Service
metadata:
name: frontend
labels:
name: frontend
spec:
type: NodePort
ports:
- port: 80
nodePort: 30001
selector:
name: frontend
创建:
kubectl create -f frontend-service.yaml
kubectl get services
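frontend 通过 NodePort 暴露在 30001 端口,可以用任意一台 node 的 IP 做一次访问验证(IP 为示例,按实际环境替换):
curl -I http://172.16.200.101:30001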
Kubernetes调度器根据特定的算法与策略将pod调度到工作节点上。在默认情况下,Kubernetes调度器可以满足绝大多数需求,例如调度pod到资源充足的节点上运行,或调度pod分散到不同节点使集群节点资源均衡等。但一些特殊的场景,默认调度算法策略并不能满足实际需求,例如使用者期望按需将某些pod调度到特定硬件节点(数据库服务部署到SSD硬盘机器、CPU/内存密集型服务部署到高配CPU/内存服务器),或就近部署交互频繁的pod(例如同一机器、同一机房、或同一网段等)。
Kubernetes中的调度策略主要分为全局调度与运行时调度2种。其中全局调度策略在调度器启动时配置,而运行时调度策略主要包括选择节点(nodeSelector),节点亲和性(nodeAffinity),pod亲和与反亲和性(podAffinity与podAntiAffinity)。Node Affinity、podAffinity/AntiAffinity以及后文即将介绍的污点(Taints)与容忍(tolerations)等特性,在Kubernetes 1.6中均处于Beta阶段。
node 添加标签:
kubectl label nodes 172.16.200.101 disktype=ssd
查看node 节点标签(label):
kubectl get node --show-labels
删除一个label:
kubectl label node 172.16.200.101 disktype-
修改一个label:
# kubectl label node 172.16.200.101 disktype=scsi --overwrite
1)选择节点(nodeselector)
apiVersion: v1
kind: ReplicationController
metadata:
name: redis-master
labels:
name: redis-master
spec:
replicas: 1
selector:
name: redis-master
template:
metadata:
labels:
name: redis-master
spec:
containers:
- name: master
image: kubeguide/redis-master
ports:
- containerPort: 6379
nodeSelector:
disktype: ssd
2)亲和性(Affinity)与非亲和性(anti-affinity)
前面提及的nodeSelector,其仅以一种非常简单的方式、即label强制限制pod调度到指定节点。而亲和性(Affinity)与非亲和性(anti-affinity)则更加灵活的指定pod调度到预期节点上,相比nodeSelector,Affinity与anti-affinity优势体现在:
亲和性主要分为3种类型:node affinity与inter-pod affinity/anti-affinity,下文会进行详细说明。
(1)节点亲和性(Node affinity)
Node affinity 在 Kubernetes 1.2 作为 alpha 引入,其涵盖了 nodeSelector 功能,主要分为 requiredDuringSchedulingIgnoredDuringExecution 与 preferredDuringSchedulingIgnoredDuringExecution 两种类型。前者可认为是一种强制限制,如果没有满足条件的节点,pod 调度就会失败;而后者可理解为软限制或偏好,即使没有完全满足条件的节点,pod 依然会被调度运行。两者的 IgnoredDuringExecution 均表示:Pod 被调度之后,即使 Node 的标签发生变化导致不再满足条件,Pod 仍会继续运行。
Node affinity举例
设置节点label:
kubectl label node 172.16.200.100 cpu=high
kubectl label node 172.16.200.101 cpu=mid
kubectl label node 172.16.200.102 cpu=low
部署pod的预期是到CPU高配的机器上(cpu=high)。
查看满足条件节点:
kubectl get nodes -l 'cpu=high'
redis-master.yaml 文件内容如下:
apiVersion: v1
kind: ReplicationController
metadata:
name: redis-master
labels:
name: redis-master
spec:
replicas: 1
selector:
name: redis-master
template:
metadata:
labels:
name: redis-master
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: cpu
operator: In
values:
- high
containers:
- name: master
image: kubeguide/redis-master
ports:
- containerPort: 6379
检查结果符合预期,pod redis-master 成功部署到 CPU 高配(cpu=high)的机器上:
# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-master-lbz9f 1/1 Running 0 11s 10.24.77.4 172.16.200.100
(2)pod亲和性(Inter-pod affinity)与反亲和性(anti-affinity)
inter-pod affinity与anti-affinity由Kubernetes 1.4引入,当前处于beta阶段,其中podAffinity用于调度pod可以和哪些pod部署在同一拓扑结构之下。而podAntiAffinity相反,其用于规定pod不可以和哪些pod部署在同一拓扑结构下。通过pod affinity与anti-affinity来解决pod和pod之间的关系。
与 Node affinity 类似,pod affinity 与 anti-affinity 同样分为 requiredDuringSchedulingIgnoredDuringExecution 与 preferredDuringSchedulingIgnoredDuringExecution 两种类型,前者被认为是强制约束,而后者可理解为软限制(soft)或偏好(preference)。
pod affinity与anti-affinity举例
本示例中假设部署场景为:期望redis-slave服务与redis-master服务就近部署,而不希望与frontend服务部署同一拓扑结构上。
redis-slave yaml文件部分内容:
apiVersion: v1
kind: ReplicationController
metadata:
name: redis-slave
spec:
replicas: 2
selector:
name: redis-slave
template:
metadata:
name: redis-slave
labels:
name: redis-slave
spec:
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: name
operator: In
values:
- redis-master
topologyKey: kubernetes.io/hostname
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: name
operator: In
values:
- frontend
topologyKey: beta.kubernetes.io/os
containers:
- name: redis-slave
image: kubeguide/guestbook-redis-slave
env:
- name: GET_HOSTS_FROM
value: env
ports:
- containerPort: 6379
查看部署结果,redis-slave服务与redis-master部署到了同一台机器,而frontend被部署在其他机器上:
# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
frontend-4nmkz 1/1 Running 0 4m 10.24.51.5 172.16.200.101
frontend-xmjsr 1/1 Running 0 4m 10.24.77.5 172.16.200.100
redis-master-lbz9f 1/1 Running 0 41m 10.24.77.4 172.16.200.100
redis-slave-t9tw4 1/1 Running 3 1m 10.24.77.7 172.16.200.100
redis-slave-zvcrg 1/1 Running 3 1m 10.24.77.6 172.16.200.100
亲和性/反亲和性调度策略比较:
调度策略 | 匹配标签 | 操作符 | 拓扑域支持 | 调度目标 |
---|---|---|---|---|
nodeAffinity | 主机 | In, NotIn, Exists, DoesNotExist, Gt, Lt | 否 | pod到指定主机 |
podAffinity | Pod | In, NotIn, Exists, DoesNotExist | 是 | pod与指定pod同一拓扑域 |
PodAntiAffinity | Pod | In, NotIn, Exists, DoesNotExist | 是 | pod与指定pod非同一拓扑域 |
实现原理:
kubedns容器的功能:
dnsmasq容器的功能:
exec-healthz容器的功能:
备注:kube-dns组件 github 下载地址
1)创建kubedns-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: kube-dns
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: EnsureExists
2)对比kubedns-dns-controller配置文件修改
diff kubedns-controller.yaml.sed /root/kubernetes/k8s-deploy/mainifest/dns/kubedns-controller.yaml
58c58
< image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4
88c88
< - --domain=$DNS_DOMAIN.
---
> - --domain=cluster.local.
109c109
< image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4
128c128
< - --server=/$DNS_DOMAIN/127.0.0.1#10053
---
> - --server=/cluster.local/127.0.0.1#10053
147c147
< image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4
160,161c160,161
< - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.$DNS_DOMAIN,5,A
< - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.$DNS_DOMAIN,5,A
---
> - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
> - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
3)创建kubedns-dns-controller
# Should keep target in cluster/addons/dns-horizontal-autoscaler/dns-horizontal-autoscaler.yaml
# in sync with this file.
# Warning: This is a file generated from the base underscore template file: kubedns-controller.yaml.base
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: kube-dns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
# replicas: not specified here:
# 1. In order to make Addon Manager do not reconcile this replicas parameter.
# 2. Default is 1.
# 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
strategy:
rollingUpdate:
maxSurge: 10%
maxUnavailable: 0
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
volumes:
- name: kube-dns-config
configMap:
name: kube-dns
optional: true
containers:
- name: kubedns
image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4
resources:
# TODO: Set memory limits when we've profiled the container for large
# clusters, then set request = limit to keep this container in
# guaranteed class. Currently, this container falls into the
# "burstable" category so the kubelet doesn't backoff from restarting it.
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
livenessProbe:
httpGet:
path: /healthcheck/kubedns
port: 10054
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /readiness
port: 8081
scheme: HTTP
# we poll on pod startup for the Kubernetes master service and
# only setup the /readiness HTTP server once that's available.
initialDelaySeconds: 3
timeoutSeconds: 5
args:
- --domain=cluster.local.
- --dns-port=10053
- --config-dir=/kube-dns-config
- --v=2
env:
- name: PROMETHEUS_PORT
value: "10055"
ports:
- containerPort: 10053
name: dns-local
protocol: UDP
- containerPort: 10053
name: dns-tcp-local
protocol: TCP
- containerPort: 10055
name: metrics
protocol: TCP
volumeMounts:
- name: kube-dns-config
mountPath: /kube-dns-config
- name: dnsmasq
image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4
livenessProbe:
httpGet:
path: /healthcheck/dnsmasq
port: 10054
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
args:
- -v=2
- -logtostderr
- -configDir=/etc/k8s/dns/dnsmasq-nanny
- -restartDnsmasq=true
- --
- -k
- --cache-size=1000
- --log-facility=-
- --server=/cluster.local/127.0.0.1#10053
- --server=/in-addr.arpa/127.0.0.1#10053
- --server=/ip6.arpa/127.0.0.1#10053
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
# see: https://github.com/kubernetes/kubernetes/issues/29055 for details
resources:
requests:
cpu: 150m
memory: 20Mi
volumeMounts:
- name: kube-dns-config
mountPath: /etc/k8s/dns/dnsmasq-nanny
- name: sidecar
image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4
livenessProbe:
httpGet:
path: /metrics
port: 10054
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
args:
- --v=2
- --logtostderr
- --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
- --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
ports:
- containerPort: 10054
name: metrics
protocol: TCP
resources:
requests:
memory: 20Mi
cpu: 10m
dnsPolicy: Default # Don't use cluster DNS.
serviceAccountName: kube-dns
4)创建 kubedns-svc.yaml
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "KubeDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: 10.254.0.2
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
5)创建 kubedns-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube-dns
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
6)部署dns-horizontal-autoscaler
创建dns-horizontal-autoscaler-rbac.yaml:
kind: ServiceAccount
apiVersion: v1
metadata:
name: kube-dns-autoscaler
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: system:kube-dns-autoscaler
labels:
addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["list"]
- apiGroups: [""]
resources: ["replicationcontrollers/scale"]
verbs: ["get", "update"]
- apiGroups: ["extensions"]
resources: ["deployments/scale", "replicasets/scale"]
verbs: ["get", "update"]
# Remove the configmaps rule once below issue is fixed:
# kubernetes-incubator/cluster-proportional-autoscaler#16
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "create"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: system:kube-dns-autoscaler
labels:
addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
name: kube-dns-autoscaler
namespace: kube-system
roleRef:
kind: ClusterRole
name: system:kube-dns-autoscaler
apiGroup: rbac.authorization.k8s.io
创建dns-horizontal-autoscaler.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: kube-dns-autoscaler
namespace: kube-system
labels:
k8s-app: kube-dns-autoscaler
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
template:
metadata:
labels:
k8s-app: kube-dns-autoscaler
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
containers:
- name: autoscaler
image: gcr.io/google_containers/cluster-proportional-autoscaler-amd64:1.1.2-r2
resources:
requests:
cpu: "20m"
memory: "10Mi"
command:
- /cluster-proportional-autoscaler
- --namespace=kube-system
- --configmap=kube-dns-autoscaler
# Should keep target in sync with cluster/addons/dns/kubedns-controller.yaml.base
- --target=Deployment/kube-dns
# When cluster is using large nodes(with more cores), "coresPerReplica" should dominate.
# If using small nodes, "nodesPerReplica" should dominate.
- --default-params={"linear":{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}}
- --logtostderr=true
- --v=2
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
serviceAccountName: kube-dns-autoscaler
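kube-dns 相关资源创建完成后,可以确认 Pod 运行状态,并在集群内起一个临时容器验证域名解析(busybox 镜像及版本为示例):
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl run dns-test --rm -it --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default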
7)创建dashboard-rbac
apiVersion: v1
kind: ServiceAccount
metadata:
name: dashboard
namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: dashboard
subjects:
- kind: ServiceAccount
name: dashboard
namespace: kube-system
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
8)创建dashboard-controller
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: kubernetes-dashboard
namespace: kube-system
labels:
k8s-app: kubernetes-dashboard
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
containers:
- name: kubernetes-dashboard
image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.1
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 300Mi
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 9090
args:
- --apiserver-host=http://172.16.200.100:8080
livenessProbe:
httpGet:
path: /
port: 9090
initialDelaySeconds: 30
timeoutSeconds: 30
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
9)创建dashboard-service
apiVersion: v1
kind: Service
metadata:
name: kubernetes-dashboard
namespace: kube-system
labels:
k8s-app: kubernetes-dashboard
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
type: NodePort
selector:
k8s-app: kubernetes-dashboard
ports:
- port: 80
targetPort: 9090
1)Kubernetes 中的 RBAC 支持
在 Kubernetes 1.6 版本中新增了基于角色的访问控制机制(Role-Based Access Control,RBAC),让集群管理员可以针对特定使用者或服务账号的角色,进行更精确的资源访问控制。在RBAC中,权限与角色相关联,用户通过成为适当角色的成员而得到这些角色的权限。
这就极大地简化了权限的管理,在一个组织中,角色是为了完成各种工作而创造,用户则依据它的责任和资格来被指派相应的角色,用户可以很容易地从一个角色被指派到另一个角色。
2)RBAC vs ABAC
鉴权的作用是,决定一个用户是否有权使用 Kubernetes API 做某些事情。它除了会影响 kubectl 等组件之外,还会对一些运行在集群内部并对集群进行操作的软件产生作用,例如使用了 Kubernetes 插件的 Jenkins,或者是利用 Kubernetes API 进行软件部署的 Helm。ABAC 和 RBAC 都能够对访问策略进行配置。
ABAC(Attribute Based Access Control)本来是不错的概念,但是在 Kubernetes 中的实现比较难于管理和理解(怪我咯),而且需要拥有 Master 所在节点的 SSH 和文件系统权限,要使对授权的变更生效,还需要重新启动 API Server。
而 RBAC 的授权策略可以利用 kubectl 或者 Kubernetes API 直接进行配置。RBAC 可以授权给用户,让用户有权进行授权管理,这样就可以无需接触节点,直接进行授权管理。RBAC 在 Kubernetes 中被映射为 API 资源和操作。
因为 Kubernetes 社区的投入和偏好,相对于 ABAC 而言,RBAC 是更好的选择。
3)基础概念
需要理解 RBAC 一些基础的概念和思路,RBAC 是让用户能够访问 Kubernetes API 资源的授权方式。
在 RBAC 中定义了两个对象,用于描述在用户和资源之间的连接权限。
(1)Role and ClusterRole
在 RBAC API 中,Role 表示一组规则权限,权限只会增加(累加权限),不存在一个资源一开始就有很多权限而通过 RBAC 对其进行减少的操作;Role 可以定义在一个 namespace 中,如果想要跨 namespace 则可以创建 ClusterRole。
(2)角色
角色是一系列的权限的集合,例如一个角色可以包含读取 Pod 的权限和列出 Pod 的权限, ClusterRole 跟 Role 类似,但是可以在集群中到处使用。
Role 只能用于授予对单个命名空间中的资源访问权限,以下是一个对默认命名空间中 Pods 具有访问权限的样例:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]
ClusterRole 具有与 Role 相同的权限角色控制能力,不同的是 ClusterRole 是集群级别的,ClusterRole 可以用于:
以下是 ClusterRole 授权某个特定命名空间或全部命名空间(取决于绑定方式)访问 secrets 的样例:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
# "namespace" omitted since ClusterRoles are not namespaced
name: secret-reader
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "watch", "list"]
(3)RoleBinding and ClusterRoleBinding
RoleBinding 可以将角色中定义的权限授予用户或用户组,RoleBinding 包含一组权限列表(subjects),权限列表中包含有不同形式的待授予权限资源类型(users, groups, or service accounts);RoleBinding 同样包含对被 Bind 的 Role 引用;RoleBinding 适用于某个命名空间内授权,而 ClusterRoleBinding 适用于集群范围内的授权。
RoleBinding 可以在同一命名空间中引用对应的 Role,以下 RoleBinding 样例将 default 命名空间的 pod-reader Role 授予 jane 用户,此后 jane 用户在 default 命名空间中将具有 pod-reader 的权限:
# This role binding allows "jane" to read pods in the "default" namespace.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: read-pods
namespace: default
subjects:
- kind: User
name: jane
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
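如果所用 kubectl 版本支持 create role / create rolebinding 子命令,上述 Role 与 RoleBinding 也可以用命令式方式创建,效果等价(示意):
kubectl create role pod-reader --verb=get --verb=watch --verb=list --resource=pods -n default
kubectl create rolebinding read-pods --role=pod-reader --user=jane -n default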
RoleBinding 同样可以引用 ClusterRole 来对当前 namespace 内用户、用户组或 ServiceAccount 进行授权,这种操作允许集群管理员在整个集群内定义一些通用的 ClusterRole,然后在不同的 namespace 中使用 RoleBinding 来引用。
例如,以下 RoleBinding 引用了一个 ClusterRole,这个 ClusterRole 具有整个集群内对 secrets 的访问权限;但是其授权用户 dave 只能访问 development 空间中的 secrets(因为 RoleBinding 定义在 development 命名空间)
# This role binding allows "dave" to read secrets in the "development" namespace.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: read-secrets
namespace: development # This only grants permissions within the "development" namespace.
subjects:
- kind: User
name: dave
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: secret-reader
apiGroup: rbac.authorization.k8s.io
最后,使用 ClusterRoleBinding 可以对整个集群中的所有命名空间资源权限进行授权;以下 ClusterRoleBinding 样例展示了授权 manager 组内所有用户在全部命名空间中对 secrets 进行访问
# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: read-secrets-global
subjects:
- kind: Group
name: manager
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: secret-reader
apiGroup: rbac.authorization.k8s.io
4)Referring to Resources
Kubernetes集群内一些资源一般以其名称字符串来表示,这些字符串一般会在 API 的 URL 地址中出现;同时某些资源也会包含子资源,例如 logs 资源就属于 pods 的子资源,API 中 URL 样例如下:
GET /api/v1/namespaces/{namespace}/pods/{name}/log
如果要在 RBAC 授权模型中控制这些子资源的访问权限,可以通过 / 分隔符来实现,以下是一个定义 pods 资源的 logs 子资源访问权限的 Role 定义样例:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
namespace: default
name: pod-and-pod-logs-reader
rules:
- apiGroups: [""]
resources: ["pods", "pods/log"]
verbs: ["get", "list"]
具体的资源引用可以通过 resourceNames 来定义,当指定 get、delete、update、patch 四个动词时,可以控制对其目标资源的相应动作。
以下为限制一个 subject 对名称为 my-configmap 的 configmap 只能具有 get 和 update 权限的样例:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
namespace: default
name: configmap-updater
rules:
- apiGroups: [""]
resources: ["configmap"]
resourceNames: ["my-configmap"]
verbs: ["update", "get"]
值得注意的是,当设定了 resourceNames 后,verbs 动词不能指定为 list、watch、create 和 deletecollection;因为这个具体的资源名称不在上面四个动词限定的请求 URL 地址中匹配到,最终会因为 URL 地址不匹配导致 Role 无法创建成功。
5)Referring to Subjects
RoleBinding 和 ClusterRoleBinding 可以将 Role 绑定到 Subjects;Subjects 可以是 groups、users 或者 service accounts。
Subjects 中 Users 使用字符串表示,它可以是一个普通的名字字符串,如 “alice”;也可以是 email 格式的邮箱地址,如 “[email protected]”;甚至是一组字符串形式的数字 ID。Users 的格式必须满足集群管理员配置的验证模块,RBAC 授权系统中没有对其做任何格式限定;但是 Users 的前缀 system: 是系统保留的,集群管理员应该确保普通用户不会使用这个前缀格式
Kubernetes 的 Group 信息目前由 Authenticator 模块提供,Groups 书写格式与 Users 相同,都为一个字符串,并且没有特定的格式要求;同样 system: 前缀为系统保留
具有 system:serviceaccount: 前缀的用户名和 system:serviceaccounts: 前缀的组为 Service Accounts
6)Role Binding Examples
以下示例仅展示 RoleBinding 的 subjects 部分。
指定一个名字为 [email protected] 的用户:
subjects:
- kind: User
name: "[email protected]"
apiGroup: rbac.authorization.k8s.io
指定一个名字为 frontend-admins 的组:
subjects:
- kind: Group
name: "frontend-admins"
apiGroup: rbac.authorization.k8s.io
指定 kube-system namespace 中默认的 Service Account:
subjects:
- kind: ServiceAccount
name: default
namespace: kube-system
指定在 qa namespace 中全部的 Service Account:
subjects:
- kind: Group
name: system:serviceaccounts:qa
apiGroup: rbac.authorization.k8s.io
指定全部 namespace 中的全部 Service Account:
subjects:
- kind: Group
name: system:serviceaccounts
apiGroup: rbac.authorization.k8s.io
指定全部的 authenticated 用户(1.5+):
subjects:
- kind: Group
name: system:authenticated
apiGroup: rbac.authorization.k8s.io
指定全部的 unauthenticated 用户(1.5+):
subjects:
- kind: Group
name: system:unauthenticated
apiGroup: rbac.authorization.k8s.io
指定全部用户:
subjects:
- kind: Group
name: system:authenticated
apiGroup: rbac.authorization.k8s.io
- kind: Group
name: system:unauthenticated
apiGroup: rbac.authorization.k8s.io
1)Ingress简介
Kubernetes 暴露服务的方式目前只有三种:LoadBalancer Service、NodePort Service、Ingress。
LoadBalancer Service 是 kubernetes 深度结合云平台的一个组件;当使用 LoadBalancer Service 暴露服务时,实际上是通过向底层云平台申请创建一个负载均衡器来向外暴露服务;目前 LoadBalancer Service 支持的云平台已经相对完善,比如国外的 GCE、DigitalOcean,国内的 阿里云,私有云 Openstack 等等,由于 LoadBalancer Service 深度结合了云平台,所以只能在一些云平台上来使用
NodePort Service 顾名思义,实质上就是通过在集群的每个 node 上暴露一个端口,然后将这个端口映射到某个具体的 service 来实现的,虽然每个 node 的端口有很多(0~65535),但是由于安全性和易用性(服务多了就乱了,还有端口冲突问题)实际使用可能并不多
Ingress 这个东西是 1.2 后才出现的,通过 Ingress 用户可以实现使用 nginx 等开源的反向代理负载均衡器实现对外暴露服务,以下详细说一下 Ingress,毕竟 traefik 用的就是 Ingress
使用 Ingress 时一般会有三个组件:
反向代理负载均衡器
反向代理负载均衡器很简单,说白了就是 nginx、apache 什么的;在集群中反向代理负载均衡器可以自由部署,可以使用 Replication Controller、Deployment、DaemonSet 等等,不过个人喜欢以 DaemonSet 的方式部署,感觉比较方便
Ingress Controller
Ingress Controller 实质上可以理解为是个监视器,Ingress Controller 通过不断地跟 kubernetes API 打交道,实时的感知后端 service、pod 等变化,比如新增和减少 pod,service 增加与减少等;当得到这些变化信息后,Ingress Controller 再结合下文的 Ingress 生成配置,然后更新反向代理负载均衡器,并刷新其配置,达到服务发现的作用
Ingress
Ingress 简单理解就是个规则定义;比如说某个域名对应某个 service,即当某个域名的请求进来时转发给某个 service;这个规则将与 Ingress Controller 结合,然后 Ingress Controller 将其动态写入到负载均衡器配置中,从而实现整体的服务发现和负载均衡
有点懵逼,那就看图:
从上图中可以很清晰的看到,实际上请求进来还是被负载均衡器拦截,比如 nginx,然后 Ingress Controller 通过跟 Ingress 交互得知某个域名对应哪个 service,再通过跟 kubernetes API 交互得知 service 地址等信息;综合以后生成配置文件实时写入负载均衡器,然后负载均衡器 reload 该规则便可实现服务发现,即动态映射。
了解了以上内容以后,这也就很好的说明了我为什么喜欢把负载均衡器部署为 Daemon Set;因为无论如何请求首先是被负载均衡器拦截的,所以在每个 node 上都部署一下,同时 hostport 方式监听 80 端口;那么就解决了其他方式部署不确定 负载均衡器在哪的问题,同时访问每个 node 的 80 都能正确解析请求;如果前端再 放个 nginx 就又实现了一层负载均衡。
可能从大致印象上Ingress就是能利用 Nginx、Haproxy 啥的负载均衡器暴露集群内服务的工具;那么问题来了,集群内服务想要暴露出去面临着几个问题:
Pod 漂移问题
众所周知 Kubernetes 具有强大的副本控制能力,能保证在任意副本(Pod)挂掉时自动从其他机器启动一个新的,还可以动态扩容等,总之一句话,这个 Pod 可能在任何时刻出现在任何节点上,也可能在任何时刻死在任何节点上;那么自然随着 Pod 的创建和销毁,Pod IP 肯定会动态变化;那么如何把这个动态的 Pod IP 暴露出去?这里借助于 Kubernetes 的 Service 机制,Service 可以以标签的形式选定一组带有指定标签的 Pod,并监控和自动负载他们的 Pod IP,那么我们向外暴露只暴露 Service IP 就行了;这就是 NodePort 模式:即在每个节点上开起一个端口,然后转发到内部 Service IP 上,如下图所示:
端口管理问题
采用 NodePort 方式暴露服务面临一个坑爹的问题是,服务一旦多起来,NodePort 在每个节点上开启的端口会及其庞大,而且难以维护;这时候引出的思考问题是 “能不能使用 Nginx 啥的只监听一个端口,比如 80,然后按照域名向后转发?” 这思路很好,简单的实现就是使用 DaemonSet 在每个 node 上监听 80,然后写好规则,因为 Nginx 外面绑定了宿主机 80 端口(就像 NodePort),本身又在集群内,那么向后直接转发到相应 Service IP 就行了,如下图所示:
域名分配及动态更新问题
从上面的思路,采用 Nginx 似乎已经解决了问题,但是其实这里面有一个很大缺陷:每次有新服务加入怎么改 Nginx 配置?总不能手动改或者来个 Rolling Update 前端 Nginx Pod 吧?这时候 “伟大而又正直勇敢的” Ingress 登场,如果不算上面的 Nginx,Ingress 只有两大组件:Ingress Controller 和 Ingress
Ingress 这个玩意,简单的理解就是 你原来要改 Nginx 配置,然后配置各种域名对应哪个 Service,现在把这个动作抽象出来,变成一个 Ingress 对象,你可以用 yml 创建,每次不要去改 Nginx 了,直接改 yml 然后创建/更新就行了;那么问题来了:”Nginx 咋整?”
Ingress Controller 这东西就是解决 “Nginx 咋整” 的;Ingress Controller 通过与 Kubernetes API 交互,动态地感知集群中 Ingress 规则变化,然后读取它,按照自己的模板生成一段 Nginx 配置,再写到 Nginx Pod 里,最后 reload 一下,工作流程如下图:
当然在实际应用中,最新版本 Kubernetes 已经将 Nginx 与 Ingress Controller 合并为一个组件,所以 Nginx 无需单独部署,只需要部署 Ingress Controller 即可。
2)Nginx Ingress
上面啰嗦了那么多,只是为了讲明白 Ingress 的各种理论概念,下面实际部署很简单。
(1)配置 ingress RBAC
cat nginx-ingress-controller-rbac.yml
#apiVersion: v1
#kind: Namespace
#metadata:
# name: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: nginx-ingress-serviceaccount
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: nginx-ingress-clusterrole
rules:
- apiGroups:
- ""
resources:
- configmaps
- endpoints
- nodes
- pods
- secrets
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- apiGroups:
- "extensions"
resources:
- ingresses
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
name: nginx-ingress-role
namespace: kube-system
rules:
- apiGroups:
- ""
resources:
- configmaps
- pods
- secrets
- namespaces
verbs:
- get
- apiGroups:
- ""
resources:
- configmaps
resourceNames:
# Defaults to "<election-id>-<ingress-class>"
# Here: "<ingress-controller-leader>-<nginx>"
# This has to be adapted if you change either parameter
# when launching the nginx-ingress-controller.
- "ingress-controller-leader-nginx"
verbs:
- get
- update
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- create
- update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: nginx-ingress-role-nisa-binding
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: nginx-ingress-role
subjects:
- kind: ServiceAccount
name: nginx-ingress-serviceaccount
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: nginx-ingress-clusterrole-nisa-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: nginx-ingress-clusterrole
subjects:
- kind: ServiceAccount
name: nginx-ingress-serviceaccount
namespace: kube-system
(2)部署默认后端
我们知道前端的 Nginx 最终要负载到后端 service 上,那么如果访问不存在的域名咋整?官方给出的建议是部署一个默认后端,对于未知请求全部负载到这个默认后端上;这个后端啥也不干,就是返回 404,部署如下:
kubectl create -f default-backend.yaml
这个 default-backend.yaml 文件可以在 github 的 Ingress 仓库找到。针对官方配置,我们单独添加了 nodeSelector 指定,绑定 LB 地址,以方便 DNS 做解析。
cat default-backend.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: default-http-backend
labels:
k8s-app: default-http-backend
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
k8s-app: default-http-backend
spec:
terminationGracePeriodSeconds: 60
containers:
- name: default-http-backend
# Any image is permissable as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: harbor-demo.dianrong.com/kubernetes/defaultbackend:1.0
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
ports:
- containerPort: 8080
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
nodeSelector:
kubernetes.io/hostname: 172.16.200.209
---
apiVersion: v1
kind: Service
metadata:
name: default-http-backend
namespace: kube-system
labels:
k8s-app: default-http-backend
spec:
ports:
- port: 80
targetPort: 8080
selector:
k8s-app: default-http-backend
(3)部署Ingress Controller
部署完了后端,就得把最重要的组件 Nginx+Ingress Controller(官方统一称为 Ingress Controller)部署上。
kubectl create -f nginx-ingress-controller.yaml
注意:官方的 Ingress Controller 有个坑,为了防止端口在宿主机上冲突,默认注释了 hostNetwork 工作方式,也就没有绑定到宿主机 80 端口,前端 Nginx 没有监听宿主机 80 端口,外部流量根本进不来。
所以需要把配置搞下来自己加一下 hostNetwork:
cat nginx-ingress-controller.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: nginx-ingress-controller
labels:
k8s-app: nginx-ingress-controller
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
k8s-app: nginx-ingress-controller
spec:
# hostNetwork makes it possible to use ipv6 and to preserve the source IP correctly regardless of docker configuration
# however, it is not a hard dependency of the nginx-ingress-controller itself and it may cause issues if port 10254 already is taken on the host
# that said, since hostPort is broken on CNI (https://github.com/kubernetes/kubernetes/issues/31307) we have to use hostNetwork where CNI is used
# like with kubeadm
# hostNetwork: true
terminationGracePeriodSeconds: 60
hostNetwork: true
serviceAccountName: nginx-ingress-serviceaccount
containers:
- image: harbor-demo.dianrong.com/kubernetes/nginx-ingress-controller:0.9.0-beta.1
name: nginx-ingress-controller
readinessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
ports:
- containerPort: 80
hostPort: 80
- containerPort: 443
hostPort: 443
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
# - --default-ssl-certificate=$(POD_NAMESPACE)/ingress-secret
nodeSelector:
kubernetes.io/hostname: 172.16.200.102
(4)部署 Ingress
从上面可以知道 Ingress 就是个规则,指定哪个域名转发到哪个 Service,所以说首先我们得有个 Service,当然 Service 去哪找这里就不管了;这里默认为已经有了两个可用的 Service,以下以 Dashboard 为例。
先写一个 Ingress 文件,语法格式啥的请参考 官方文档,由于我的 Dashboard 都在kube-system 这个命名空间,所以要指定 namespace:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: dashboard-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
rules:
- host: fox-dashboard.dianrong.com
http:
paths:
- backend:
serviceName: kubernetes-dashboard
servicePort: 80
部署成功,访问截图如下:
(5)部署 Ingress TLS
上面已经搞定了 Ingress,下面就顺便把 TLS 怼上;官方给出的样例很简单,大致步骤就两步:创建一个含有证书的 secret、在 Ingress 中开启证书;但是我不得不吐槽一下,文档就提那么一嘴,大坑一堆,比如多域名配置、TLS 功能如何启用都没有细讲。启用 TLS 需要在 nginx-ingress-controller 上添加参数,上面的 controller 已配置好:
--default-ssl-certificate=$(POD_NAMESPACE)/ingress-secret
(6)证书格式转换
创建secret 需要使用你的证书文件,官方要求证书的编码需要使用base64。转换方法如下:
证书转换pem 格式:
openssl x509 -inform DER -in cert/domain.crt -outform PEM -out cert/domain.pem
证书编码转换base64:
cat domain.crt | base64 > domain.crt.base64
(7)创建 secret
需要使用base64 编码格式证书:
cat ingress-secret.yml
apiVersion: v1
data:
tls.crt: LS0tLS1CRU
tls.key: LS0tLS1CRU
kind: Secret
metadata:
name: ingress-secret
namespace: kube-system
type: Opaque
其实证书转码这类操作没必要手动去做,可以直接使用下面的命令创建,kubectl 会自动为我们完成格式的转换。
kubectl create secret tls ingress-secret --key certs/ttlinux.com.cn-key.pem --cert certs/ttlinux.com.cn.pem
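创建完成后可以先确认一下 secret 是否生成(常规 kubectl 用法示意):
kubectl get secret ingress-secret -n kube-system
kubectl describe secret ingress-secret -n kube-system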
(8)重新部署 Ingress
生成完成后需要在 Ingress 中开启 TLS,Ingress 修改后如下:
cat dashboard-ingress.yml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: dashboard-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
tls:
- hosts:
- fox-dashboard.dianrong.com
secretName: ingress-secret
rules:
- host: fox-dashboard.dianrong.com
http:
paths:
- backend:
serviceName: kubernetes-dashboard
servicePort: 80
注意:一个 Ingress 只能使用一个 secret(secretName 段只能有一个),也就是说只能用一个证书,更直白的说就是如果你在一个 Ingress 中配置了多个域名,那么使用 TLS 的话必须保证证书支持该 Ingress 下所有域名;并且这个 secretName 一定要放在上面域名列表最后位置,否则会报错 did not find expected key 无法创建;同时上面的 hosts 段下域名必须跟下面的 rules 中完全匹配。
更需要注意一点:之所以这里单独开一段就是因为有大坑;Kubernetes Ingress 默认情况下,当你不配置证书时,会默认给你一个 TLS 证书的,也就是说你 Ingress 中配置错了,比如写了2个 secretName、或者 hosts 段中缺了某个域名,那么对于写了多个 secretName 的情况,所有域名全会走默认证书;对于 hosts 缺了某个域名的情况,缺失的域名将会走默认证书,部署时一定要验证一下证书,不能 “有了就行”;更新 Ingress 证书可能需要等一段时间才会生效。
最后重新部署一下即可:
kubectl delete -f dashboard-ingress.yml
kubectl create -f dashboard-ingress.yml
部署 TLS 后 80 端口会自动重定向到 443,最终访问截图如下:
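按照上文"部署时一定要验证一下证书"的提醒,可以用 openssl 直接探测节点的 443 端口确认下发的证书(node_ip 为任一部署了 Ingress Controller 的节点地址,仅为示意):
echo | openssl s_client -connect <node_ip>:443 -servername fox-dashboard.dianrong.com 2>/dev/null | openssl x509 -noout -subject -issuer -dates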
3)ingress 高级用法
总结:
1)必要的环境
Kubernetes集群,版本1.4以上
相关镜像准备:
gcr.io/google_containers/kube-state-metrics:v0.5.0
prom/prometheus:v1.7.0
prom/node-exporter:v0.14.0
giantswarm/tiny-tools
dockermuenster/caddy:0.9.3
grafana/grafana:4.2.0
quay.io/prometheus/alertmanager:v0.7.1
将上述镜像下载到本地后,使用docker load命令加载到Kubernetes每台Node节点上。
例如将 heapster-grafana-amd64_v4_2_0.tar.gz 镜像包上传到 k8s-master 和 k8s-node 节点,然后手动加载:
docker load -i heapster-grafana-amd64_v4_2_0.tar.gz
2)监控组件安装
下载 Prometheus 的部署文件 manifests-all.yaml,使用 kubectl 命令创建 Prometheus 各个组件:
kubectl create -f manifests-all.yaml
执行完成后,Kubernetes集群中会出现一个新的Namespace: "monitoring",里面有很多组件:
kubectl get pods,svc,deployment,job,daemonset,ingress --namespace=monitoring
NAME READY STATUS RESTARTS AGE
po/alertmanager-3874563995-fqvet 1/1 Running 1 1d
po/grafana-core-741762473-exne3 1/1 Running 1 2d
po/kube-state-metrics-1381605391-hiqti 1/1 Running 1 1d
po/kube-state-metrics-1381605391-j11e6 1/1 Running 1 2d
po/node-directory-size-metrics-0abcp 2/2 Running 2 16d
po/node-directory-size-metrics-6xmzk 2/2 Running 2 16d
po/node-directory-size-metrics-d5cka 2/2 Running 2 16d
po/node-directory-size-metrics-ojo1x 2/2 Running 2 16d
po/node-directory-size-metrics-rdvn8 2/2 Running 2 16d
po/node-directory-size-metrics-tfqox 2/2 Running 2 16d
po/node-directory-size-metrics-wkec1 2/2 Running 2 16d
po/prometheus-core-4080573952-vu2dg 1/1 Running 49 1d
po/prometheus-node-exporter-1dnvp 1/1 Running 1 16d
po/prometheus-node-exporter-64763 1/1 Running 1 16d
po/prometheus-node-exporter-6h6u0 1/1 Running 1 16d
po/prometheus-node-exporter-i29ic 1/1 Running 1 16d
po/prometheus-node-exporter-i6mvh 1/1 Running 1 16d
po/prometheus-node-exporter-lxqou 1/1 Running 1 16d
po/prometheus-node-exporter-n1n8y 1/1 Running 1 16d
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/alertmanager 192.168.3.247 9093/TCP 16d
svc/grafana 192.168.3.89 3000/TCP 16d
svc/kube-state-metrics 192.168.3.78 8080/TCP 16d
svc/prometheus 192.168.3.174 9090/TCP 16d
svc/prometheus-node-exporter None 9100/TCP 16d
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/alertmanager 1 1 1 1 16d
deploy/grafana-core 1 1 1 1 16d
deploy/kube-state-metrics 2 2 2 2 16d
deploy/prometheus-core 1 1 1 1 16d
NAME DESIRED SUCCESSFUL AGE
jobs/grafana-import-dashboards 1 1 16d
NAME DESIRED CURRENT NODE-SELECTOR AGE
ds/node-directory-size-metrics 7 7 16d
ds/prometheus-node-exporter 7 7 16d
NAME HOSTS ADDRESS PORTS AGE
ing/grafana grafana.yeepay.com 80 16d
manifests-all.yaml 文件中使用 30161 和 30162 两个 NodePort 端口作为 Grafana 和 Prometheus Web 界面的访问端口。
3)登录Grafana
通过http://${Your_API_SERVER_IP}:30161/登录Grafana,默认的用户名和密码都是admin。
4)添加数据源
点击左上角图标,找到DataSource选项,添加数据源:
5)添加Prometheus的数据源
将 Prometheus 作为数据源,相关参数如下图所示:
点击Save & Test按钮,保存数据源。
6)监控Kubernetes集群
导入模板文件:
点击左上角Grafana的图标,选在Dashboard选项。
可以从 Grafana 官方的 Dashboards 页面下载各种监控模板,然后使用 Upload 导入到 Grafana。
也可搜索:Dashboards | Grafana Labs
直接导入 node_exporter.json 监控模板,这个可以把 node 节点指标显示出来,也可直接导入 docker_rev1.json,这个可以把容器资源指标显示出来了。
Pod级别资源监控实例:
1)Helm简介
微服务和容器化给复杂应用的部署与管理带来了极大的挑战。Helm 是 Kubernetes 生态中主流的应用包管理开源项目,作为 Kubernetes 应用的包管理工具,可理解为 Kubernetes 的 apt-get / yum,由 Deis 公司发起,该公司后来被微软收购。Helm 通过软件打包的形式,支持发布的版本管理和控制,很大程度上简化了 Kubernetes 应用部署和管理的复杂性。
随着业务容器化与向微服务架构转变,通过把巨大的单体应用分解为多个服务的方式,化解了单体应用的复杂性,使每个微服务都可以独立部署和扩展,实现了敏捷开发、快速迭代和部署。但任何事情都有两面性,虽然微服务给我们带来了很多便利,但由于应用被拆分成多个组件,导致服务数量大幅增加;对于 Kubernetes 编排来说,每个组件有自己的资源文件,并且可以独立地部署与伸缩,这给采用 Kubernetes 做应用编排带来了诸多挑战:
Helm把Kubernetes资源(比如deployments、services或 ingress等) 打包到一个chart中,而chart被保存到chart仓库。通过chart仓库可用来存储和分享chart。Helm使发布可配置,支持发布应用配置的版本管理,简化了Kubernetes部署应用的版本控制、打包、发布、删除、更新等操作。
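对应到日常操作,Helm(v2,与下文安装的版本一致)的典型用法大致如下(仓库地址与 chart 名仅为示意):
helm repo add stable https://kubernetes-charts.storage.googleapis.com   # 添加 chart 仓库(示例地址)
helm search mysql                            # 在仓库中搜索 chart
helm install stable/mysql --name my-mysql    # 安装一个 release
helm upgrade my-mysql stable/mysql           # 升级 release
helm rollback my-mysql 1                     # 回滚到指定版本
helm delete my-mysql                         # 删除 release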
本文简单介绍了Helm的用途、架构与实现。
Helm基本架构如下:
2)Helm安装
读完本文后,您应该可以自己创建 chart,并创建自己的私有 chart 仓库。
Helm是一个kubernetes应用的包管理工具,用来管理charts——预先配置好的安装包资源,有点类似于Ubuntu的APT和CentOS中的yum。
Helm chart 是用来封装 kubernetes 原生应用程序的 yaml 文件,可以在部署应用的时候自定义应用程序的一些 metadata,便于应用程序的分发。
Helm和charts的主要作用:
安装步骤:
首先需要安装helm客户端:
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
创建tiller的serviceaccount和clusterrolebinding:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
然后安装helm服务端tiller:
helm init -i harbor-demo.dianrong.com/k8s/kubernetes-helm-tiller:v2.7.0 --service-account tiller
我们使用 -i 指定自己的镜像,因为官方镜像在某些网络环境下无法直接拉取。
为应用程序设置serviceAccount:
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
检查是否安装成功:
$ kubectl -n kube-system get pods|grep tiller
tiller-deploy-3243657295-4dg28 1/1 Running 0 8m
# helm version
Client: &version.Version{SemVer:"v2.7.0", GitCommit:"08c1144f5eb3e3b636d9775617287cc26e53dba4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.7.0", GitCommit:"08c1144f5eb3e3b636d9775617287cc26e53dba4", GitTreeState:"clean"}
注意检查是否安装了 socat 依赖:
yum install socat
3)创建自己的chart
我们创建一个名为mychart的chart,看一看chart的文件结构:
# helm create mychart
# tree mychart/
mychart/
├── charts
├── Chart.yaml
├── templates
│ ├── deployment.yaml
│ ├── _helpers.tpl
│ ├── ingress.yaml
│ ├── NOTES.txt
│ └── service.yaml
└── values.yaml
2 directories, 7 files
templates 目录下是 yaml 文件的模板,遵循 Go template 语法。使用过 Hugo 等静态网站生成工具的人应该对此很熟悉。
我们查看下deployment.yaml文件的内容:
# cat mychart/templates/deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: {{ template "mychart.fullname" . }}
labels:
app: {{ template "mychart.name" . }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
spec:
replicas: {{ .Values.replicaCount }}
template:
metadata:
labels:
app: {{ template "mychart.name" . }}
release: {{ .Release.Name }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: {{ .Values.service.internalPort }}
livenessProbe:
httpGet:
path: /
port: {{ .Values.service.internalPort }}
readinessProbe:
httpGet:
path: /
port: {{ .Values.service.internalPort }}
resources:
{{ toYaml .Values.resources | indent 12 }}
{{- if .Values.nodeSelector }}
nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
{{- end }}
这是该应用的 Deployment 的 yaml 配置文件,其中双大括号包裹起来的部分是 Go template,其中的 Values 是在 values.yaml 文件中定义的:
cat mychart/values.yaml
# Default values for mychart.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1
image:
repository: nginx
tag: 1.9
pullPolicy: IfNotPresent
service:
name: nginx
type: ClusterIP
externalPort: 80
internalPort: 80
ingress:
enabled: false
# Used to create an Ingress record.
hosts:
- chart-example.local
annotations:
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
tls:
# Secrets must be manually created in the namespace.
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
resources:
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
比如在 deployment.yaml 中定义的容器镜像 image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}",其中的 image.repository 与 image.tag 两个变量值是在 create chart 的时候自动生成的默认值。
我们将默认的镜像地址和tag改成我们自己的镜像sz-pg-oam-docker-hub-001.tendcloud.com/library/nginx:1.9。
检查配置和模板是否有效:
当使用 kubernetes 部署应用的时候,实际上是将 templates 渲染成最终 kubernetes 能够识别的 yaml 格式。可以使用
helm install --dry-run --debug <chart_dir> --name <release_name>
命令来验证 chart 配置:
该输出中包含了模板的变量配置与最终渲染的yaml文件。
# helm install --dry-run --debug mychart/ --name test
[debug] Created tunnel using local port: '45443'
[debug] SERVER: "localhost:45443"
[debug] Original chart version: ""
[debug] CHART PATH: /root/k8s/helm/mychart
NAME: test
REVISION: 1
RELEASED: Fri Nov 3 13:37:44 2017
CHART: mychart-0.1.0
USER-SUPPLIED VALUES:
{}
COMPUTED VALUES:
image:
pullPolicy: IfNotPresent
repository: nginx
tag: 1.9
ingress:
annotations: null
enabled: false
hosts:
- chart-example.local
tls: null
replicaCount: 1
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
service:
externalPort: 80
internalPort: 80
name: nginx
type: ClusterIP
HOOKS:
MANIFEST:
---
# Source: mychart/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
name: test-mychart
labels:
app: mychart
chart: mychart-0.1.0
release: test
heritage: Tiller
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 80
protocol: TCP
name: nginx
selector:
app: mychart
release: test
---
# Source: mychart/templates/deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: test-mychart
labels:
app: mychart
chart: mychart-0.1.0
release: test
heritage: Tiller
spec:
replicas: 1
template:
metadata:
labels:
app: mychart
release: test
spec:
containers:
- name: mychart
image: "nginx:1.9"
imagePullPolicy: IfNotPresent
ports:
- containerPort: 80
livenessProbe:
httpGet:
path: /
port: 80
readinessProbe:
httpGet:
path: /
port: 80
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
我们可以看到 Deployment 和 Service 的名字以 release 名称开头(这里指定了 --name test,所以是 test-mychart);如果安装时不指定 --name,Helm 会随机生成由两个单词组成的 release 名称(例如后文出现的 eating-hound、peeking-yak),资源名称的前半截就是这个随机名称,后半截才是 chart 自身的名称。
4)部署到kubernetes
在mychart目录下执行下面的命令将nginx部署到kubernetes集群上:
# helm install . --name test
NAME: test
LAST DEPLOYED: Fri Nov 3 14:30:19 2017
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-mychart ClusterIP 10.254.242.188 80/TCP 0s
==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
test-mychart 1 0 0 0 0s
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app=mychart,release=test" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl port-forward $POD_NAME 8080:80
现在 nginx 已经部署到 kubernetes 集群上,在本地执行提示中的命令,即可在本地主机上访问到 nginx 实例:
export POD_NAME=$(kubectl get pods --namespace default -l "app=eating-hound-mychart" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl port-forward $POD_NAME 8080:80
在本地访问http://127.0.0.1:8080即可访问到nginx。
查看部署的 release:
# helm list
NAME REVISION UPDATED STATUS CHART NAMESPACE
peeking-yak 1 Fri Nov 3 15:01:20 2017 DEPLOYED mychart-0.1.0 default
删除部署的release:
$ helm delete eating-hound
release "eating-hound" deleted
5)打包分享
我们可以修改Chart.yaml中的helm chart配置信息,然后使用下列命令将chart打包成一个压缩文件:
helm package .
打包出mychart-0.1.0.tgz文件。
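打包之后,最简单的私有 chart 仓库其实就是一个带 index.yaml 的静态 HTTP 目录,大致步骤如下(目录与地址均为示意):
mkdir -p /opt/charts && cp mychart-0.1.0.tgz /opt/charts/
helm repo index /opt/charts --url http://charts.example.com    # 生成/更新 index.yaml
# 用任意静态 Web 服务(如 nginx)发布 /opt/charts 目录后即可使用:
helm repo add myrepo http://charts.example.com
helm repo update
helm search mychart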
创建filebeat configmap 配置:
kubectl create configmap filebeat-shipper-logs-config --from-file=filebeat_shipper_logs.yml -n noah-demo
kubectl create configmap filebeat-config --from-file=filebeat.yml -n noah-demo
configmap 创建注意事项:
挂载目录下的文件名称,即为 cm 定义里的 key 值(映射目录):
kubectl create configmap filebeat-shipper-logs-config --from-file=filebeat-shipper-logs-config=filebeat_shipper_logs.yml -n noah-demo
挂载目录下的文件的内容,即为 cm 定义里的 value 值。value 可以多行定义,这在一些稍微复杂的场景下特别有用,比如 my.cnf:
kubectl create configmap filebeat-shipper-logs-config --from-file=filebeat_shipper_logs.yml -n noah-demo
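挂载时的对应关系可以用下面这个最小示例来理解(Pod 名称与镜像均为假设,仅示意 configMap 卷的用法):
apiVersion: v1
kind: Pod
metadata:
  name: filebeat-demo
  namespace: noah-demo
spec:
  containers:
  - name: filebeat
    image: docker.elastic.co/beats/filebeat:6.4.0       # 镜像版本仅为示意
    volumeMounts:
    - name: shipper-config
      mountPath: /etc/filebeat/                         # 该目录下出现的文件名即 cm 里的 key
  volumes:
  - name: shipper-config
    configMap:
      name: filebeat-shipper-logs-config                # 文件内容即 cm 里的 value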
bin/kafka-console-consumer.sh --zookeeper 10.18.19.56:2181,10.18.19.57:2181,10.18.19.58:2181 --topic amc_alert_lists --from-beginning
从 github release 页面,下载最新版本。
1)准备
当前Kubernetes 1.8的小版本是1.8.7。 在升级之前一定要多读几遍官方的升级须知Kubernetes 1.8 - Action Required Before Upgrading。其中和我们相关的:
2)使用ansible升级Kubernetes核心组件
接下来尝试使用ansible将Kubernetes的核心组件从1.7升级到1.8。 直接使用二进制包覆盖原有路径。
ansible kube-node -m unarchive -a 'src=/root/k8s/k8s-v1.8.7/kubernetes-server-linux-amd64.tar.gz dest=/usr/local/'
3)生产使用建议关闭 swap
Kubernetes 1.8开始要求关闭系统的Swap,如果不关闭,默认配置下kubelet将无法启动。可以通过kubelet的启动参数--fail-swap-on=false更改这个限制。 我们这里关闭系统的Swap:
swapoff -a
修改 /etc/fstab 文件,注释掉 SWAP 的自动挂载,使用free -m确认swap已经关闭。
swappiness参数调整,修改/etc/sysctl.d/k8s.conf添加下面一行:
vm.swappiness=0
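修改 fstab 与让内核参数生效可以参考下面的命令(sed 写法仅为示意,也可以手动编辑文件):
sed -ri 's/.*swap.*/#&/' /etc/fstab        # 注释掉 swap 的自动挂载
free -m                                     # 确认 swap 已经为 0
sysctl -p /etc/sysctl.d/k8s.conf            # 使 vm.swappiness=0 生效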
4)重启kubelet服务
systemctl restart kubelet.service
检查node 是否升级成功:
[root@cd-k8s-master k8s-v1.8.7]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
172.16.200.206 Ready 121d v1.8.7
172.16.200.209 Ready 171d v1.8.7
172.16.200.216 Ready 126d v1.8.7
部署环境依赖:如果使用 kubeadm 方式部署 K8S,由于 apiserver 等组件以容器方式运行,默认镜像不带有 ceph-common 客户端驱动,可以通过部署 rbd-provisioner、以独立组件提供 rbd 驱动的方式解决此问题。
创建一个rbd-provisioner.yaml 驱动:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: rbd-provisioner
namespace: monitoring
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: rbd-provisioner
spec:
containers:
- name: rbd-provisioner
image: "quay.io/external_storage/rbd-provisioner:latest"
env:
- name: PROVISIONER_NAME
value: ceph.com/rbd
args: ["-master=http://10.18.19.98:8080", "-id=rbd-provisioner"]
1)生成Ceph secret
使用 Ceph 管理员提供给你的 ceph.client.admin.keyring 文件,我们将它放在了 /etc/ceph 目录下,用来生成 secret:
grep key /etc/ceph/ceph.client.admin.keyring |awk '{printf "%s", $NF}'|base64
2)创建Ceph secret
apiVersion: v1
kind: Secret
metadata:
name: ceph-secret
namespace: monitoring
type: "kubernetes.io/rbd"
data:
key: QVFCZU54dFlkMVNvRUJBQUlMTUVXMldSS29mdWhlamNKaC8yRXc9PQ==
3)创建StorageClass
二进制部署方式可参考以下 ceph-class.yaml 文件内容:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: prometheus-ceph
namespace: noah
provisioner: ceph.com/rbd
parameters:
monitors: 10.18.19.91:6789
adminId: admin
adminSecretName: ceph-secret
adminSecretNamespace: monitoring
userSecretName: ceph-secret
pool: prometheus #建立自己的RBD存储池
userId: admin
4)调用 rbd-provisioner,参考以下内容:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: kong-cassandra-fast
namespace: monitoring
provisioner: ceph.com/rbd #调用rbd-provisione
parameters:
monitors: 10.18.19.91:6789
adminId: admin
adminSecretName: ceph-secret
adminSecretNamespace: monitoring
userSecretName: ceph-secret
pool: prometheus
userId: admin
5)列出所有的pool
ceph osd lspools
6)列出pool中的所有镜像
rbd ls prometheus
7)创建pool
ceph osd pool create prometheus 128 128
配置 prometheus,添加ceph class。
配置文件如下:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: prometheus-core
namespace: monitoring
labels:
app: prometheus
component: core
version: v1
spec:
serviceName: prometheus-core
replicas: 1
template:
metadata:
labels:
app: prometheus
component: core
spec:
serviceAccountName: prometheus-k8s
containers:
- name: prometheus
image: prom/prometheus:v1.7.0
args:
- '-storage.local.retention=336h'
- '-storage.local.memory-chunks=1048576'
- '-config.file=/etc/prometheus/prometheus.yaml'
- '-alertmanager.url=http://alertmanager:9093/'
ports:
- name: webui
containerPort: 9090
resources:
requests:
cpu: 2
memory: 2Gi
limits:
cpu: 2
memory: 2Gi
volumeMounts:
- name: config-volume
mountPath: /etc/prometheus
- name: rules-volume
mountPath: /etc/prometheus-rules
- name: data
mountPath: /prometheus/data
volumes:
- name: config-volume
configMap:
name: prometheus-core
- name: rules-volume
configMap:
name: prometheus-rules
volumeClaimTemplates:
- metadata:
name: data
annotations:
volume.beta.kubernetes.io/storage-class: "ceph-aliyun"
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 50Gi
1)Kubernetes CI/CD简介
持续构建与发布是我们日常工作中必不可少的一个步骤,目前大多公司都采用 Jenkins 集群来搭建符合需求的 CI/CD 流程,然而传统的 Jenkins Slave 一主多从方式会存在一些痛点,比如:主 Master 发生单点故障时,整个流程都不可用了;每个 Slave 的配置环境不一样,来完成不同语言的编译打包等操作,但是这些差异化的配置导致管理起来非常不方便,维护起来也是比较费劲;资源分配不均衡,有的 Slave 要运行的 job 出现排队等待,而有的 Slave 处于空闲状态;最后资源有浪费,每台 Slave 可能是实体机或者 VM,当 Slave 处于空闲状态时,也不会完全释放掉资源。
提到基于 Kubernetes 的 CI/CD,可以使用的工具有很多,比如 Jenkins、Gitlab CI 以及新兴的 drone 之类的,我们这里会使用大家最为熟悉的 Jenkins 来做 CI/CD 的工具。
优点:
在急着使用之前,我们先了解下在 Kubernetes 环境下面使用 Jenkins 有什么好处。前面提到,传统的 Jenkins Slave 一主多从方式会存在一些痛点,比如:
正因为上面的这些种种痛点,我们渴望一种更高效更可靠的方式来完成这个 CI/CD 流程,而 Docker 容器技术能很好地解决这些痛点,特别是在 Kubernetes 集群环境下面能够更好地解决上面的问题,下图是基于 Kubernetes 搭建 Jenkins 集群的简单示意图。
从图上可以看到 Jenkins Master 和 Jenkins Slave 以 Pod 形式运行在 Kubernetes 集群的 Node 上,Master 运行在其中一个节点,并且将其配置数据存储到一个 Volume 上去,Slave 运行在各个节点上,并且它不是一直处于运行状态,它会按照需求动态的创建并自动删除。
这种方式的工作流程大致为:当 Jenkins Master 接受到 Build 请求时,会根据配置的 Label 动态创建一个运行在 Pod 中的 Jenkins Slave 并注册到 Master 上,当运行完 Job 后,这个 Slave 会被注销并且这个 Pod 也会自动删除,恢复到最初状态。
那么我们使用这种方式带来了哪些好处呢?
以前我们面临的种种问题,在 Kubernetes 集群环境下面是不是都没有了?看上去非常完美。
2)安装
我们这里就不再去详细讲述什么是 Jenkins了,直接进入正题。既然要基于Kubernetes来做CI/CD,当然我们这里需要将 Jenkins 安装到 Kubernetes 集群当中,新建一个 Deployment。
jenkins-deployment.yaml:
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: jenkins2
namespace: kube-ops
spec:
template:
metadata:
labels:
app: jenkins2
spec:
terminationGracePeriodSeconds: 10
serviceAccountName: jenkins2
containers:
- name: jenkins
image: jenkins/jenkins:lts
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
name: web
protocol: TCP
- containerPort: 50000
name: agent
protocol: TCP
resources:
limits:
cpu: 2000m
memory: 4Gi
requests:
cpu: 1000m
memory: 2Gi
livenessProbe:
httpGet:
path: /login
port: 8080
initialDelaySeconds: 60
timeoutSeconds: 5
failureThreshold: 12
readinessProbe:
httpGet:
path: /login
port: 8080
initialDelaySeconds: 60
timeoutSeconds: 5
failureThreshold: 12
volumeMounts:
- name: jenkinshome
subPath: jenkins2
mountPath: /var/jenkins_home
env:
- name: LIMITS_MEMORY
valueFrom:
resourceFieldRef:
resource: limits.memory
divisor: 1Mi
- name: JAVA_OPTS
value: -Xmx$(LIMITS_MEMORY)m -XshowSettings:vm -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85 -Duser.timezone=Asia/Shanghai
securityContext:
fsGroup: 1000
volumes:
- name: jenkinshome
persistentVolumeClaim:
claimName: opspvc
---
apiVersion: v1
kind: Service
metadata:
name: jenkins2
namespace: kube-ops
labels:
app: jenkins2
spec:
selector:
app: jenkins2
ports:
- name: web
port: 8080
targetPort: web
- name: agent
port: 50000
targetPort: agent
为了方便演示,我们把本节课所有的对象资源都放置在一个名为 kube-ops 的 namespace 下面,所以需要先创建一个 namespace:
kubectl create namespace kube-ops
我们这里使用一个名为 jenkins/jenkins:lts 的镜像,这是 jenkins 官方的 Docker 镜像,然后也有一些环境变量,当然我们也可以根据自己的需求来定制一个镜像,比如我们可以将一些插件打包在自定义的镜像当中,可以参考文档:https://github.com/jenkinsci/docker
我们这里使用默认的官方镜像就行,另外还需要注意的是,我们将容器的 /var/jenkins_home 目录挂载到了一个名为 opspvc 的 PVC 对象上面,所以同样还得提前创建一个对应的 PVC 对象,当然也可以使用我们前面的 StorageClass 对象来自动创建。
jenkins-pvc.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
name: opspv
spec:
capacity:
storage: 200Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Delete
nfs:
server: 10.34.11.12
path: /opt/nfs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: opspvc
namespace: kube-ops
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 200Gi
创建需要用到的 PVC 对象:
$ kubectl create -f jenkins-pvc.yaml
另外我们这里还需要使用到一个拥有相关权限的 serviceAccount:jenkins2,我们这里只是给 jenkins 赋予了一些必要的权限,当然如果你对 serviceAccount 的权限不是很熟悉的话,我们给这个 sa 绑定一个 cluster-admin 的集群角色权限也是可以的,当然这样具有一定的安全风险。jenkins-rbac.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
name: jenkins2
namespace: kube-ops
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: jenkins2
namespace: kube-ops
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods/exec"]
verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get","list","watch"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: jenkins2
namespace: kube-ops
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: jenkins2
subjects:
- kind: ServiceAccount
name: jenkins2
namespace: kube-ops
创建 rbac 相关的资源对象:
$ kubectl create -f jenkins-rbac.yaml
最后为了方便我们测试,我们这里通过 ingress的形式来访问Jenkins 的 web 服务,Jenkins 服务端口为8080,50000 端口为agent,这个端口主要是用于 Jenkins 的 master 和 slave 之间通信使用的。
jenkins-ingress.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: jenkins-ingress
namespace: kube-ops
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
rules:
- host: jenkins-k8s.jiedai361.com
http:
paths:
- backend:
serviceName: jenkins2
servicePort: 8080
一切资源准备好之后,我们直接创建 Jenkins 服务:
kubectl create -f jenkins-deployment.yaml
最后创建ingress 路由服务:
kubectl apply -f jenkins-ingress.yaml
创建完成后,要去拉取镜像可能需要等待一会儿,然后我们查看下 Pod 的状态:
kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins2-58df5d8fdc-w9xqh 1/1 Running 0 7d
等到服务启动成功后,我们就可以通过 ingress 服务域名(jenkins-k8s.jiedai361.com)访问 jenkins 服务了,然后根据提示信息进行安装配置即可:
初始化的密码我们可以在 jenkins 的容器的日志中进行查看,也可以直接在 nfs 的共享数据目录中查看:
cat /opt/nfs/jenkins2/secret/initAdminPassword
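如果想从容器日志里查看初始密码,可以这样(Pod 名称以实际查询结果为准):
kubectl get pods -n kube-ops -l app=jenkins2
kubectl logs <jenkins-pod-name> -n kube-ops     # 输出的横幅中会打印初始管理员密码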
然后选择安装推荐的插件即可:
安装完成后添加管理员帐号即可进入到 jenkins 主界面(jenkins home) :
3)配置
接下来我们就需要来配置 Jenkins,让它能够动态地生成 Slave 的 Pod。
jenkins依赖插件清单
第1步:我们需要安装kubernetes plugin, 点击 Manage Jenkins -> Manage Plugins -> Available -> Kubernetes plugin 勾选安装即可。 kubernetes plugin
第2步:安装完毕后,点击 Manage Jenkins —> Configure System —> (拖到最下方)Add a new cloud —> 选择 Kubernetes,然后填写 Kubernetes 和 Jenkins 配置信息。
注意:namespace 我们这里填 kube-ops,然后点击 Test Connection,如果出现 Connection test successful 的提示信息,证明 Jenkins 已经可以和 Kubernetes 系统正常通信了;下方的 Jenkins URL 地址填 http://jenkins2.kube-ops.svc.cluster.local:8080,格式为:服务名.namespace.svc.cluster.local:8080,按照上面创建的 jenkins 服务名填写,我们这里创建的服务名为 jenkins2。
第3步:配置 Pod Template,其实就是配置 Jenkins Slave 运行的 Pod 模板,命名空间我们同样是用kube-ops,Labels 这里也非常重要,对于后面执行 Job 的时候需要用到该值,然后我们这里使用的是 cnych/jenkins:jnlp 这个镜像,这个镜像是在官方的 jnlp 镜像基础上定制的,加入了 kubectl 等一些实用的工具。
另外需要注意我们这里需要在下面挂载一个主机目录,一个是 /var/run/docker.sock,该文件是用于 Pod 中的容器能够共享宿主机的 Docker,这就是大家说的 docker in docker 的方式,Docker 二进制文件我们已经打包到上面的镜像中了。
如果在slave agent中想要访问kubernetes 集群中其他资源,我们还需要绑定之前创建的Service Account 账号:jenkins2:
另外还有几个参数需要注意,如下图中的 Time in minutes to retain slave when idle,这个参数表示当处于空闲状态的时候保留 Slave Pod 多长时间,这个参数保持默认就行了;如果设置得过大,Job 任务执行完成后,对应的 Slave Pod 就不会立即被销毁删除。
到这里我们的 Kubernetes Plugin插件就算配置完成了。
4)测试
Kubernetes 插件的配置工作完成了,接下来我们就来添加一个 Job 任务,看是否能够在 Slave Pod 中执行,任务执行完成后看 Pod 是否会被销毁。
在 Jenkins 首页点击create new jobs,创建一个测试的任务,输入任务名称,然后我们选择 Freestyle project 类型的任务。
jenkins demo:
注意在下面的 Label Expression 这里要填入haimaxy-jnlp,就是前面我们配置的Slave Pod 中的 Label,这两个地方必须保持一致:
然后往下拉,在Build 区域选择Execute shell:
然后输入我们测试命令:
echo "测试 Kubernetes 动态生成 jenkins slave"
echo "==============docker in docker==========="
docker info
echo "=============kubectl============="
kubectl get pods -n kube-system
最后点击保存:
现在我们直接在页面点击左侧的 Build now 触发构建即可,然后观察 Kubernetes 集群中 Pod 的变化:
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins2-7c85b6f4bd-rfqgv 1/1 Running 3 1d
jnlp-hfmvd 0/1 ContainerCreating 0 7s
同样也可以查看到对应的控制台信息:
到这里证明我们的任务已经构建完成,然后这个时候我们再去集群查看我们的 Pod 列表,发现 kube-ops 这个 namespace 下面已经没有之前的 Slave 这个 Pod 了。
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins2-7c85b6f4bd-rfqgv 1/1 Running 3 1d
到这里我们就完成了使用Kubernetes动态生成Jenkins Slave 的方法。
如何在Jenkins中来发布我们的Kubernetes应用?
5)Jenkins Pipeline简介
要实现在 Jenkins 中的构建工作,可以有多种方式,我们这里采用比较常用的 Pipeline 这种方式。Pipeline,简单来说,就是一套运行在 Jenkins 上的工作流框架,将原来独立运行于单个或者多个节点的任务连接起来,实现单个任务难以完成的复杂流程编排和可视化的工作。
Jenkins Pipeline 有几个核心概念:Node(节点,即执行环境)、Stage(阶段,对步骤的逻辑分组)和 Step(步骤,最小的执行单元)。
那么我们如何创建 Jenkins Pipline 呢?
6)创建Pipeline
我们这里来给大家快速创建一个简单的 Pipeline,直接在 Jenkins 的 Web UI 界面中输入脚本运行。
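一个不指定 label 的最简 Pipeline 脚本大致如下(即下文所说"上面创建的 Pipeline 脚本",stage 划分仅为示意):
node {
  stage('Clone') {
    echo "1.Clone Stage"
  }
  stage('Test') {
    echo "2.Test Stage"
  }
  stage('Build') {
    echo "3.Build Stage"
  }
  stage('Deploy') {
    echo "4. Deploy Stage"
  }
}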
隔一会儿,构建完成,可以点击左侧区域的 Console Output,我们就可以看到如下输出信息:
7)在Slave中构建任务
上面我们创建了一个简单的 Pipeline 任务,但是我们可以看到这个任务并没有在Jenkins 的 Slave 中运行,那么如何让我们的任务跑在 Slave 中呢?还记得上节课我们在添加 Slave Pod 的时候,一定要记住添加的 label 吗?没错,我们就需要用到这个 label,我们重新编辑上面创建的 Pipeline 脚本,给node添加一个label属性,如下:
node('haimaxy-jnlp') {
stage('Clone') {
echo "1.Clone Stage"
}
stage('Test') {
echo "2.Test Stage"
}
stage('Build') {
echo "3.Build Stage"
}
stage('Deploy') {
echo "4. Deploy Stage"
}
}
我们这里只是给 node 添加了一个 haimaxy-jnlp这样的一个label,然后我们保存,构建之前查看下 kubernetes 集群中的 Pod:
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins-7c85b6f4bd-rfqgv 1/1 Running 4 6d
然后重新触发立刻构建:
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins-7c85b6f4bd-rfqgv 1/1 Running 4 6d
jnlp-0hrrz 1/1 Running 0 23s
我们发现多了一个名叫 jnlp-0hrrz 的 Pod 正在运行,隔一会儿这个 Pod 就不在了:
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins-7c85b6f4bd-rfqgv 1/1 Running 4 6d
这也证明我们的 Job 构建完成了,同样回到 Jenkins 的 Web UI 界面中查看Console Output,可以看到如下的信息:
是不是也证明我们当前的任务在跑在上面动态生成的这个 Pod 中,也符合我们的预期。我们回到 Job 的主界面,也可以看到大家可能比较熟悉的 Stage View 界面:
https://blog.qikqiak.com/img/posts/pipeline-demo3.png
8)部署Kubernetes应用
上面我们已经知道了如何在 Jenkins Slave 中构建任务了,那么如何来部署一个原生的 Kubernetes 应用呢?
要部署 Kubernetes 应用,我们就得对我们之前部署应用的流程要非常熟悉才行,我们之前的流程是怎样的:
我们之前在Kubernetes 环境中部署一个原生应用的流程应该基本上是上面这些流程吧?现在我们就需要把上面这些流程放入 Jenkins 中来自动帮我们完成(当然编码除外),从测试到更新 YAML 文件属于 CI 流程,后面部署属于 CD 的流程。如果按照我们上面的示例,我们现在要来编写一个 Pipeline 的脚本,应该怎么编写呢?
node('haimaxy-jnlp') {
stage('Clone') {
echo "1.Clone Stage"
}
stage('Test') {
echo "2.Test Stage"
}
stage('Build') {
echo "3.Build Docker Image Stage"
}
stage('Push') {
echo "4.Push Docker Image Stage"
}
stage('YAML') {
echo "5. Change YAML File Stage"
}
stage('Deploy') {
echo "6. Deploy Stage"
}
}
这里我们拿一个简单的 golang 程序部署到 kubernetes 环境中作为例子,代码链接:https://github.com/cnych/jenkins-demo。按照之前的示例,Pipeline 脚本的骨架就像上面这样,把各个 stage 串起来,整个 CI/CD 的流程就都覆盖到了。
接下来我们就来对每一步具体要做的事情进行详细描述就行了:
第一步:Clone 代码
stage('Clone') {
echo "1.Clone Stage"
git url: "https://github.com/cnych/jenkins-demo.git"
}
第二步:测试
由于我们这里比较简单,忽略该步骤即可。
第三步:构建镜像
stage('Build') {
echo "3.Build Docker Image Stage"
sh "docker build -t cnych/jenkins-demo:${build_tag} ."
}
我们平时构建的时候是不是都是直接使用docker build命令进行构建就行了,那么这个地方呢?我们上节课给大家提供的 Slave Pod 的镜像里面是不是采用的 Docker In Docker 的方式,也就是说我们也可以直接在 Slave 中使用 docker build 命令,所以我们这里直接使用 sh 直接执行 docker build 命令即可,但是镜像的 tag 呢?
如果我们不指定镜像 tag,则每次都是 latest 的 tag,这对于以后的排查或者回滚之类的工作会带来很大麻烦;我们这里采用 git commit 的记录作为镜像的 tag,这样做的好处是镜像的 tag 可以和 git 提交记录对应起来,也方便日后对应查看。
但是由于这个 tag 不只是我们这一个 stage 需要使用,下一个推送镜像是不是也需要,所以这里我们把这个tag 编写成一个公共的参数,把它放在 Clone 这个 stage 中,这样一来我们前两个 stage 就变成了下面这个样子:
stage('Clone') {
echo "1.Clone Stage"
git url: "https://github.com/cnych/jenkins-demo.git"
script {
build_tag = sh(returnStdout: true, script: 'git rev-parse --short HEAD').trim()
}
}
stage('Build') {
echo "3.Build Docker Image Stage"
sh "docker build -t cnych/jenkins-demo:${build_tag} ."
}
第四步:推送镜像
镜像构建完成了,现在我们就需要将此处构建的镜像推送到镜像仓库中去,当然如果你有私有镜像仓库也可以,我们这里还没有自己搭建私有的仓库,所以直接使用 docker hub 即可。
我们知道 docker hub 是公共的镜像仓库,任何人都可以获取上面的镜像,但是要往上推送镜像我们就需要用到一个帐号了,所以我们需要提前注册一个 docker hub 的帐号,记住用户名和密码,我们这里需要使用。
正常来说我们在本地推送 docker 镜像的时候,是不是需要使用docker login命令,然后输入用户名和密码,认证通过后,就可以使用docker push命令来推送本地的镜像到 docker hub 上面去了,如果是这样的话,我们这里的 Pipeline 是不是就该这样写了:
stage('Push') {
echo "4.Push Docker Image Stage"
sh "docker login -u cnych -p xxxxx"
sh "docker push cnych/jenkins-demo:${build_tag}"
}
如果我们只是在 Jenkins 的 Web UI 界面中完成这个任务,Pipeline 确实可以这样写;但我们更推荐把 Pipeline 以 Jenkinsfile 的形式放入源码中进行版本管理,这样一来直接把 docker 仓库的用户名和密码写在脚本里显然是非常不安全的,更何况我们这里使用的是 github 的公共代码仓库,所有人都可以直接看到我们的源码,所以我们应该用一种方式来隐藏用户名和密码这种私密信息,幸运的是 Jenkins 为我们提供了解决方法。
在首页点击 Credentials -> Stores scoped to Jenkins 下面的 Jenkins -> Global credentials (unrestricted) -> 左侧的 Add Credentials:添加一个 Username with password 类型的认证信息,如下:
Add Credentials 输入 docker hub 的用户名和密码,ID 部分我们输入dockerHub,注意,这个值非常重要,在后面 Pipeline 的脚本中我们需要使用到这个 ID 值。
有了上面的 docker hub 的用户名和密码的认证信息,现在我们可以在 Pipeline 中使用这里的用户名和密码了:
stage('Push') {
echo "4.Push Docker Image Stage"
withCredentials([usernamePassword(credentialsId: 'dockerHub', passwordVariable: 'dockerHubPassword', usernameVariable: 'dockerHubUser')]) {
sh "docker login -u ${dockerHubUser} -p ${dockerHubPassword}"
sh "docker push cnych/jenkins-demo:${build_tag}"
}
}
注意我们这里在 stage 中使用了一个新的函数withCredentials,其中有一个credentialsId值就是我们刚刚创建的 ID 值,然后我们就可以在脚本中直接使用这里两个变量值来直接替换掉之前的登录 docker hub 的用户名和密码,现在是不是就很安全了,我只是传递进去了两个变量而已,别人并不知道我的真正用户名和密码,只有我们自己的 Jenkins 平台上添加的才知道。
第五步:更改 YAML
上面我们已经完成了镜像的打包、推送的工作,接下来我们是不是应该更新 Kubernetes 系统中应用的镜像版本了,当然为了方便维护,我们都是用 YAML 文件的形式来编写应用部署规则,比如我们这里的 YAML 文件。
k8s.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: jenkins-demo
spec:
template:
metadata:
labels:
app: jenkins-demo
spec:
containers:
- image: cnych/jenkins-demo:<BUILD_TAG>
imagePullPolicy: IfNotPresent
name: jenkins-demo
对于 Kubernetes 比较熟悉的人,对上面这个 YAML 文件一定不会陌生,我们使用一个 Deployment 资源对象来管理 Pod,该 Pod 使用的就是我们上面推送的镜像,唯一不同的地方是 Docker 镜像的 tag 不是我们平常见的具体的 tag,而是一个 <BUILD_TAG> 这样的占位标识。
实际上如果我们将这个标识替换成上面的 Docker 镜像的 tag,是不是就是最终我们本次构建需要使用到的镜像?怎么替换呢?其实也很简单,我们使用一个sed命令就可以实现了:
stage('YAML') {
echo "5. Change YAML File Stage"
sh "sed -i 's//${build_tag}/' k8s.yaml"
}
上面的 sed 命令就是将 k8s.yaml 文件中的 <BUILD_TAG> 标识替换成变量 build_tag 的值。
第六步:部署
Kubernetes 应用的 YAML 文件已经更改完成了,之前我们手动的环境下,是不是直接使用 kubectl apply 命令就可以直接更新应用了啊?当然我们这里只是写入到了 Pipeline 里面,思路都是一样的:
stage('Deploy') {
echo "6. Deploy Stage"
sh "kubectl apply -f k8s.yaml"
}
这样到这里我们的整个流程就算完成了。
9)人工确认
理论上来说我们上面的6个步骤其实已经完成了,但是一般在我们的实际项目实践过程中,可能还需要一些人工干预的步骤,这是为什么呢?
比如我们提交了一次代码,测试也通过了,镜像也打包上传了,但是这个版本并不一定就是要立刻上线到生产环境的,对吧,我们可能需要将该版本先发布到测试环境、QA 环境、或者预览环境之类的,总之直接就发布到线上环境去还是挺少见的,所以我们需要增加人工确认的环节,一般都是在CD的环节才需要人工干预,比如我们这里的最后两步,我们就可以在前面加上确认,比如:
stage('YAML') {
echo "5. Change YAML File Stage"
def userInput = input(
id: 'userInput',
message: 'Choose a deploy environment',
parameters: [
[
$class: 'ChoiceParameterDefinition',
choices: "Dev\nQA\nProd",
name: 'Env'
]
]
)
echo "This is a deploy step to ${userInput.Env}"
sh "sed -i 's//${build_tag}/' k8s.yaml"
}
我们这里使用了 input 关键字,里面使用一个 Choice 的列表来让用户进行选择,然后在我们选择了部署环境后,我们当然也可以针对不同的环境再做一些操作,比如可以给不同环境的 YAML 文件部署到不同的 namespace 下面去,增加不同的标签等等操作:
stage('Deploy') {
echo "6. Deploy Stage"
if (userInput.Env == "Dev") {
// deploy dev stuff
} else if (userInput.Env == "QA"){
// deploy qa stuff
} else {
// deploy prod stuff
}
sh "kubectl apply -f k8s.yaml"
}
由于这一步也属于部署的范畴,所以我们可以将最后两步都合并成一步,我们最终的 Pipeline 脚本如下:
node('haimaxy-jnlp') {
stage('Clone') {
echo "1.Clone Stage"
git url: "https://github.com/cnych/jenkins-demo.git"
script {
build_tag = sh(returnStdout: true, script: 'git rev-parse --short HEAD').trim()
}
}
stage('Test') {
echo "2.Test Stage"
}
stage('Build') {
echo "3.Build Docker Image Stage"
sh "docker build -t cnych/jenkins-demo:${build_tag} ."
}
stage('Push') {
echo "4.Push Docker Image Stage"
withCredentials([usernamePassword(credentialsId: 'dockerHub', passwordVariable: 'dockerHubPassword', usernameVariable: 'dockerHubUser')]) {
sh "docker login -u ${dockerHubUser} -p ${dockerHubPassword}"
sh "docker push cnych/jenkins-demo:${build_tag}"
}
}
stage('Deploy') {
echo "5. Deploy Stage"
def userInput = input(
id: 'userInput',
message: 'Choose a deploy environment',
parameters: [
[
$class: 'ChoiceParameterDefinition',
choices: "Dev\nQA\nProd",
name: 'Env'
]
]
)
echo "This is a deploy step to ${userInput}"
sh "sed -i 's//${build_tag}/' k8s.yaml"
if (userInput == "Dev") {
// deploy dev stuff
} else if (userInput == "QA"){
// deploy qa stuff
} else {
// deploy prod stuff
}
sh "kubectl apply -f k8s.yaml"
}
}
现在我们在 Jenkins Web UI 中重新配置 jenkins-demo 这个任务,将上面的脚本粘贴到 Script 区域,重新保存,然后点击左侧的 Build Now,触发构建,然后过一会儿我们就可以看到 Stage View 界面出现了暂停的情况:
这就是我们上面在 Deploy 阶段加入的人工确认步骤,所以这个时候构建暂停了,需要我们人为地确认一下;比如我们这里选择 QA,然后点击 Proceed,就可以继续往下走,然后构建就成功了,我们在 Stage View 的 Deploy 这个阶段可以看到如下的一些日志信息:
打印出来了 QA,和我们刚刚的选择是一致的,现在我们去 Kubernetes 集群中观察下部署的应用:
$ kubectl get deployment -n kube-ops
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
jenkins 1 1 1 1 7d
jenkins-demo 1 1 1 0 1m
$ kubectl get pods -n kube-ops
NAME READY STATUS RESTARTS AGE
jenkins-7c85b6f4bd-rfqgv 1/1 Running 4 7d
jenkins-demo-f6f4f646b-2zdrq 0/1 Completed 4 1m
$ kubectl logs jenkins-demo-f6f4f646b-2zdrq -n kube-ops
Hello, Kubernetes!I'm from Jenkins CI!
我们可以看到我们的应用已经正确的部署到了 Kubernetes 的集群环境中了。
参照以上的方法,我们简单地使用 pipeline 完成了一个应用的部署,不过这对使用者的 pipeline 基础有一定要求。
在此提供一个shell 脚本,实现一个CI/CD 流程:
#!/bin/bash
# Filename: k8s-deploy_v0.1.sh
# Description: jenkins CI/CD 持续发布脚本
# Author: yi.hu
# Email: [email protected]
# Revision: 1.0
# Date: 2018-08-10
# Note: prd
# zookeeper基础服务,依照环境实际地址配置
init() {
local lowerEnv="$(echo ${AppEnv} | tr '[:upper:]' '[:lower:]')"
case "${lowerEnv}" in
dev)
CFG_ADDR="10.34.11.186:4181"
DR_CFG_ZOOKEEPER_ENV_URL="10.34.11.186:4181"
;;
demo)
CFG_ADDR="10.34.11.186:4181"
DR_CFG_ZOOKEEPER_ENV_URL="10.34.11.186:4181"
;;
*)
echo "Not support AppEnv: ${AppEnv}"
exit 1
;;
esac
}
# 初始化变量
AppId=$(echo ${AppOrg}_${AppEnv}_${AppName} |sed 's/[^a-zA-Z0-9_-]//g' | tr "[:lower:]" "[:upper:]")
CFG_LABEL=${CfgLabelBaseNode}/${AppId}
CFG_ADDR=${CFG_ADDR}
# 登录harbor 仓库
docker_login () {
docker login ${DOCKER_REGISTRY} -u${User} -p${PassWord}
}
# 编译代码,制作镜像
build() {
echo $ACTION
if [ "x${ACTION}" == "xDEPLOY" ] || [ "x${ACTION}" == "xPRE_DEPLOY" ]; then
echo "Test harbor registry: ${DOCKER_REGISTRY}"
#curl -X GET --connect-timeout 30 -I ${DOCKER_REGISTRY}/v2/ 2>/dev/null | grep 'HTTP/1.1 200 OK' > /dev/null
curl --connect-timeout 30 -I ${DOCKER_REGISTRY}/api/projects 2>/dev/null | grep 'HTTP/1.1 200 OK' > /dev/null
echo "Check image EXIST or NOT: ${ToImage}"
#Image_Check=$(echo ${ToImage} | sed 's/\([^/]\+\)\([^:]\+\):/\1\/v2\2\/manifests\//')
ImageCheck_Harbor=$(echo ${ToImage} | sed 's/\([^/]\+\)\([^:]\+\):/\1\/api\/repositories\2\/tags\//')
Responed_Code=$(curl -u${User}:${PassWord} -so /dev/null -w '%{response_code}' ${ImageCheck_Harbor} || true)
if [ "${NoCache}" == "true" ] || [ "x${Responed_Code}" != "x200" ] ; then
if [ "x${BuildCmd}" == "x" ]; then
echo "Generating Dockerfile"
echo "FROM ${FromImage}" > Dockerfile
cat >> Dockerfile <<- EOF
${Dockerfile}
EOF
echo "$( Dockerfile <<- EOF
FROM ${FromImage}
MAINTAINER devops
ADD ${CodeBasename} \${AppCode}
EOF
echo "$( ${WORKSPACE}/${AppName}-deploy.yaml <<- EOF
#####################################################
#
# ${ACTION} Deployment
#
#####################################################
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: Deployment
metadata:
name: ${AppName}
namespace: ${NameSpace}
labels:
app: ${AppName}
version: ${GitBranch}
AppEnv: ${AppEnv}
spec:
replicas: ${Replicas}
selector:
matchLabels:
app: ${AppName}
template:
metadata:
labels:
app: ${AppName}
spec:
containers:
- name: ${AppName}
image: ${ToImage}
ports:
- containerPort: ${ContainerPort}
livenessProbe:
httpGet:
path: ${HealthCheckURL}
port: ${ContainerPort}
initialDelaySeconds: 90
timeoutSeconds: 5
periodSeconds: 5
readinessProbe:
httpGet:
path: ${HealthCheckURL}
port: ${ContainerPort}
initialDelaySeconds: 5
timeoutSeconds: 5
periodSeconds: 5
# configmap env
env:
- name: CFG_LABEL
value: ${CFG_LABEL}
- name: CFG_ADDR
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: CFG_ADDR
- name: DR_CFG_ZOOKEEPER_ENV_URL
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: DR_CFG_ZOOKEEPER_ENV_URL
- name: CFG_FILES
valueFrom:
configMapKeyRef:
name: ${ConfigMap}
key: CFG_FILES
# configMap volume
volumeMounts:
- name: applogs
mountPath: /volume_logs/
volumes:
- name: applogs
hostPath:
path: /opt/app_logs/${AppName}
imagePullSecrets:
- name: ${ImagePullSecrets}
---
apiVersion: v1
kind: Service
metadata:
name: ${AppName}
namespace: ${NameSpace}
labels:
app: ${AppName}
spec:
ports:
- port: ${ContainerPort}
targetPort: ${ContainerPort}
selector:
app: ${AppName}
---
kind: ConfigMap
apiVersion: v1
metadata:
name: ${ConfigMap}
namespace: ${NameSpace}
data:
CFG_ADDR: ${CFG_ADDR}
DR_CFG_ZOOKEEPER_ENV_URL: ${DR_CFG_ZOOKEEPER_ENV_URL}
CFG_FILES: ${Cfs_Files}
EOF
# 执行部署
deploy
# 打印配置
cat ${WORKSPACE}/${AppName}-deploy.yaml
apiVersion: v1
kind: Service
metadata:
name: zk-hs
labels:
app: zk
spec:
selector:
app: zk
ports:
- port: 2888
name: server
- port: 3888
name: leader-election
clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
name: zk-cs
labels:
app: zk
spec:
selector:
app: zk
ports:
- port: 2181
name: client
---
apiVersion: v1
kind: ConfigMap
metadata:
name: zk-config
data:
ensemble: "zk-0;zk-1;zk-2"
replicas: "3"
jvm.heap: "2048M"
tick: "2000"
init: "10"
sync: "5"
client.cnxns: "60"
snap.retain: "3"
purge.interval: "1"
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: zk-pdb
spec:
selector:
matchLabels:
app: zk
minAvailable: 2
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: zk
spec:
selector:
matchLabels:
app: zk
serviceName: zk-hs
replicas: 3
updateStrategy:
type: RollingUpdate
podManagementPolicy: OrderedReady
template:
metadata:
labels:
app: zk
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: "servicetype"
operator: In
values:
- ops
topologyKey: "kubernetes.io/hostname"
containers:
- name: zk
image: registry.cn-hangzhou.aliyuncs.com/xianlu/k8szk:v2
imagePullPolicy: IfNotPresent
ports:
- containerPort: 2181
name: client
- containerPort: 2888
name: server
- containerPort: 3888
name: leader-election
resources:
requests:
cpu: "500m"
memory: "512Mi"
env:
- name : ZK_ENSEMBLE
valueFrom:
configMapKeyRef:
name: zk-config
key: ensemble
- name : ZK_REPLICAS
valueFrom:
configMapKeyRef:
name: zk-config
key: replicas
- name : ZK_HEAP_SIZE
valueFrom:
configMapKeyRef:
name: zk-config
key: jvm.heap
- name : ZK_TICK_TIME
valueFrom:
configMapKeyRef:
name: zk-config
key: tick
- name : ZK_INIT_LIMIT
valueFrom:
configMapKeyRef:
name: zk-config
key: init
- name : ZK_SYNC_LIMIT
valueFrom:
configMapKeyRef:
name: zk-config
key: sync
- name : ZK_MAX_CLIENT_CNXNS
valueFrom:
configMapKeyRef:
name: zk-config
key: client.cnxns
- name: ZK_SNAP_RETAIN_COUNT
valueFrom:
configMapKeyRef:
name: zk-config
key: snap.retain
- name: ZK_PURGE_INTERVAL
valueFrom:
configMapKeyRef:
name: zk-config
key: purge.interval
- name: ZK_CLIENT_PORT
value: "2181"
- name: ZK_SERVER_PORT
value: "2888"
- name: ZK_ELECTION_PORT
value: "3888"
command:
- sh
- -c
- zkGenConfig.sh && zkServer.sh start-foreground
readinessProbe:
exec:
command:
- "zkOk.sh"
initialDelaySeconds: 15
timeoutSeconds: 5
livenessProbe:
exec:
command:
- "zkOk.sh"
initialDelaySeconds: 15
timeoutSeconds: 5
volumeMounts:
- name: data
mountPath: /var/lib/zookeeper
volumes:
- name: data
emptyDir: {}
securityContext:
runAsUser: 1000
fsGroup: 1000
# volumeClaimTemplates:
# - metadata:
# name: data
# spec:
# accessModes: [ "ReadWriteOnce" ]
# storageClassName: "gluster-heketi-2"
# resources:
# requests:
# storage: 2Gi
架构图:
1)使用helm在CCE部署rabbitmq-exporter
安装helm:
wget https://get.helm.sh/helm-v3.3.4-linux-amd64.tar.gz
tar -zxvf helm-v3.3.4-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm version
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /root/.kube/config
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /root/.kube/config
version.BuildInfo{Version:"v3.3.4", GitCommit:"a61ce5633af99708171414353ed49547cf05013d", GitTreeState:"clean", GoVersion:"go1.14.9"}
部署rabbitmq-exporter:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prom-rabbit prometheus-community/prometheus-rabbitmq-exporter --set "rabbitmq.url=http://rabbitserver:15672" --set "rabbitmq.user=XXXt" --set "rabbitmq.password=XXXXX" --namespace=monitoring
验证部署是否成功:
kubectl get pod,svc -n monitoring
编辑 prometheus configmap,添加 rabbitmq-exporter 的 job_name,使其监控指标存入 prometheus:
注意 target 为上面查询到的 rabbitmq-exporter service 服务地址。
kind: ConfigMap
apiVersion: v1
metadata:
name: prometheus
namespace: monitoring
selfLink: /api/v1/namespaces/monitoring/configmaps/prometheus
uid: 036a2fbf-3718-4372-a138-672c62898048
resourceVersion: '3126060'
creationTimestamp: '2021-08-26T02:58:45Z'
labels:
app: prometheus
chart: prometheus-2.21.11
component: server
heritage: Tiller
release: cceaddon-prometheus
annotations:
description: ''
managedFields:
- manager: Go-http-client
operation: Update
apiVersion: v1
time: '2021-08-28T02:53:49Z'
fieldsType: FieldsV1
fieldsV1:
'f:data':
.: {}
'f:prometheus.yml': {}
'f:metadata':
'f:annotations':
.: {}
'f:description': {}
'f:labels':
.: {}
'f:app': {}
'f:chart': {}
'f:component': {}
'f:heritage': {}
'f:release': {}
data:
prometheus.yml: |-
global:
evaluation_interval: 1m
scrape_interval: 15s
scrape_timeout: 10s
alerting:
alertmanagers:
- scheme: https
tls_config:
insecure_skip_verify: true
static_configs:
- targets:
-
alert_relabel_configs:
- source_labels: [kubernetes_pod]
action: replace
target_label: pod
regex: (.+)
- source_labels: [pod_name]
action: replace
target_label: pod
regex: (.+)
rule_files:
- /etc/prometheus/rules/*/*.yaml
scrape_configs:
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
job_name: kubernetes-cadvisor
kubernetes_sd_configs:
- role: node
relabel_configs:
- replacement: kubernetes.default.svc:443
target_label: __address__
- regex: (.+)
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
source_labels:
- __meta_kubernetes_node_name
target_label: __metrics_path__
- target_label: cluster
replacement: f4486e8b-00dc-11ec-a6bf-0255ac1000cf
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
- job_name: kubernetes-nodes
kubernetes_sd_configs:
- role: node
relabel_configs:
- regex: (.+)
replacement: $1:9100
source_labels:
- __meta_kubernetes_node_name
target_label: __address__
- target_label: cluster
replacement: f4486e8b-00dc-11ec-a6bf-0255ac1000cf
- target_label: node
source_labels: [instance]
- job_name: kubernetes-service-endpoints
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- target_label: cluster
replacement: f4486e8b-00dc-11ec-a6bf-0255ac1000cf
- action: keep
regex: true
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_scrape
- action: replace
regex: (https?)
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_scheme
target_label: __scheme__
- action: replace
regex: (.+)
source_labels:
- __meta_kubernetes_service_annotation_prometheus_io_path
target_label: __metrics_path__
- action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
source_labels:
- __address__
- __meta_kubernetes_service_annotation_prometheus_io_port
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: kubernetes_namespace
- action: replace
source_labels:
- __meta_kubernetes_service_name
target_label: kubernetes_service
- honor_labels: false
job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
relabel_configs:
- target_label: cluster
replacement: f4486e8b-00dc-11ec-a6bf-0255ac1000cf
- action: keep
regex: true
source_labels:
- __meta_kubernetes_pod_annotation_prometheus_io_scrape
- action: drop
regex: cceaddon-prometheus-node-exporter-(.+)
source_labels:
- __meta_kubernetes_pod_name
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: kubernetes_namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: kubernetes_pod
- action: replace
regex: (.+)
source_labels:
- __meta_kubernetes_pod_annotation_prometheus_io_path
target_label: __metrics_path__
- action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
source_labels:
- __address__
- __meta_kubernetes_pod_annotation_prometheus_io_port
target_label: __address__
- action: replace
regex: (.+)
source_labels:
- __meta_kubernetes_pod_annotation_prometheus_io_scheme
target_label: __scheme__
metric_relabel_configs:
- source_labels: [ __name__ ]
regex: 'kube_node_labels'
action: drop
tls_config:
insecure_skip_verify: true
- job_name: 'istio-mesh'
metrics_path: /stats/prometheus
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_container_port_name]
action: keep
regex: http-envoy-prom
metric_relabel_configs:
- target_label: cluster
replacement: f4486e8b-00dc-11ec-a6bf-0255ac1000cf
- source_labels: [__name__]
action: keep
regex: istio.*
- job_name: 'rabbitmq-exporter'
static_configs:
- targets: ['prom-rabbit-exporter-prometheus-rabbitmq-exporter:9419']
metric_relabel_configs:
- target_label: namespace
replacement: default
使用prometheus 控制平台验证 target ,查看rabbitmq-exporter 部署是否成功:
验证rabbitmq_queue_messages 消息队列数值:
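在 Prometheus 的 Graph 页面可以直接用类似下面的表达式查询队列消息数(queue 标签按实际队列名替换):
rabbitmq_queue_messages{queue="input.service.run_intelligent_classification"}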
2)构建自定义metric
修改 adapter-config,自定义查询规则:
新增字段:externalRules。
kind: ConfigMap
apiVersion: v1
metadata:
name: adapter-config
namespace: monitoring
selfLink: /api/v1/namespaces/monitoring/configmaps/adapter-config
uid: 9b559a81-b0f0-483f-ab4a-65df6073efcd
resourceVersion: '3131912'
creationTimestamp: '2021-08-26T02:58:45Z'
labels:
release: cceaddon-prometheus
managedFields:
- manager: Go-http-client
operation: Update
apiVersion: v1
time: '2021-08-26T02:58:45Z'
fieldsType: FieldsV1
fieldsV1:
'f:data':
.: {}
'f:config.yaml': {}
'f:metadata':
'f:labels':
.: {}
'f:release': {}
data:
config.yaml: |-
rules:
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters: []
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_seconds_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: ^container_.*_total$
resources:
overrides:
namespace:
resource: namespace
pod_name:
resource: pod
name:
matches: ^container_(.*)$
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_total$
resources:
template: <<.Resource>>
name:
matches: ""
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_seconds_total
resources:
template: <<.Resource>>
name:
matches: ^(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
seriesFilters: []
resources:
template: <<.Resource>>
name:
matches: ^(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{kubernetes_namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_total$
resources:
overrides:
kubernetes_namespace:
resource: namespace
kubernetes_pod:
resource: pod
name:
matches: ""
as: ""
metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
- seriesQuery: '{kubernetes_namespace!="",__name__!~"^container_.*"}'
seriesFilters:
- isNot: .*_seconds_total
resources:
overrides:
kubernetes_namespace:
resource: namespace
kubernetes_pod:
resource: pod
name:
matches: ^(.*)_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{kubernetes_namespace!="",kubernetes_service!=""}'
seriesFilters:
- isNot: .*_seconds_total
resources:
overrides:
kubernetes_namespace:
resource: namespace
kubernetes_service:
resource: service
name:
matches: ^(.*)_total$
as: ""
metricsQuery: (avg(sum(rate(<<.Series>>{}[5m])) by (kubernetes_service, instance)) by (kubernetes_service))
- seriesQuery: '{kubernetes_namespace!="",__name__!~"^container_.*"}'
seriesFilters: []
resources:
overrides:
kubernetes_namespace:
resource: namespace
kubernetes_pod:
resource: pod
name:
matches: ^(.*)_seconds_total$
as: ""
metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
resourceRules:
cpu:
containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
memory:
containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)
nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
resources:
overrides:
instance:
resource: node
namespace:
resource: namespace
pod_name:
resource: pod
containerLabel: container_name
window: 1m
externalRules:
- seriesQuery: '{__name__=~"^rabbitmq_.*",queue="input.service.run_intelligent_classification"}'
resources:
template: <<.Resource>>
name:
matches: ""
as: ""
metricsQuery: 'max(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
为 custom-metrics-apiserver 注册 external.metrics.k8s.io 的 APIService,创建 k8s-api.yml:
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
name: v1beta1.external.metrics.k8s.io
spec:
service:
name: custom-metrics-apiserver
namespace: monitoring
group: external.metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
验证 rabbitmq_queue_messages 指标是否能通过 external metrics API 查询到:
# kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/rabbitmq_queue_messages"
{"kind":"ExternalMetricValueList","apiVersion":"external.metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/external.metrics.k8s.io/v1beta1/namespaces/default/rabbitmq_queue_messages"},"items":[{"metricName":"rabbitmq_queue_messages","metricLabels":{},"timestamp":"2021-08-28T10:37:16Z","value":"32847"}]}
编辑hpa-controller-custom-metrics clusterrolebinding权限,修改namespace:
kubectl edit clusterrolebinding -oyaml hpa-controller-custom-metrics
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
creationTimestamp: "2021-08-26T02:58:45Z"
labels:
release: cceaddon-prometheus
name: hpa-controller-custom-metrics
resourceVersion: "3147794"
selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/hpa-controller-custom-metrics
uid: d7ab1332-e54f-42ad-bb51-0b0c5db4f386
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
name: horizontal-pod-autoscaler
namespace: kube-system # 由 monitoring 修改为 kube-system
编辑 custom-metrics-server-resources 这个 clusterrole,新增 apiGroups:external.metrics.k8s.io
kubectl edit clusterrole custom-metrics-server-resources
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: "2021-08-26T02:58:45Z"
labels:
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-view: "true"
release: cceaddon-prometheus
name: custom-metrics-server-resources
resourceVersion: "3140104"
selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/custom-metrics-server-resources
uid: 5fb1a3d6-6da2-4197-a039-55f7770cca3e
rules:
- apiGroups:
- custom.metrics.k8s.io
- external.metrics.k8s.io # 新增external组
resources:
- '*'
verbs:
- '*'
部署nginx deployment服务,并创建HPA 策略:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: scale-nginx
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1 # minimum number of replicas
  maxReplicas: 5 # maximum number of replicas
  metrics:
  - type: External
    external:
      metricName: rabbitmq_queue_messages # name of the external (custom) metric
      targetValue: 1000 # scale nginx out once the queue holds more than 1000 messages
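Once the HPA exists, its current metric value and scaling activity can be inspected, for example:
$ kubectl get hpa scale-nginx -n default
$ kubectl describe hpa scale-nginx -n default   # shows the external metric value and recent scaling events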
HPA RBAC permission error (what the HPA looks like before the RBAC fix above is applied):
[root@alg-ty-69070 ~]# kubectl get hpa -oyaml scale
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2021-08-28T08:48:11Z","reason":"SucceededGetScale","message":"the HPA controller was able to get the target''s current scale"},{"type":"ScalingActive","status":"False","lastTransitionTime":"2021-08-28T08:48:11Z","reason":"FailedGetExternalMetric","message":"the HPA was unable to compute the replica count: unable to get external metric default/rabbitmq_queue_messages/nil: unable to fetch metrics from external metrics API: rabbitmq_queue_messages.external.metrics.k8s.io is forbidden: User \"system:serviceaccount:kube-system:horizontal-pod-autoscaler\" cannot list resource \"rabbitmq_queue_messages\" in API group \"external.metrics.k8s.io\" in the namespace \"default\""}]'
    autoscaling.alpha.kubernetes.io/metrics: '[{"type":"External","external":{"metricName":"rabbitmq_queue_messages","targetValue":"1k"}}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"scale","namespace":"default"},"spec":{"maxReplicas":10,"metrics":[{"external":{"metricName":"rabbitmq_queue_messages","targetValue":1000},"type":"External"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"alg-ty-es"}}}
  creationTimestamp: "2021-08-28T08:47:56Z"
  managedFields:
  - apiVersion: autoscaling/v2beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:spec:
        f:maxReplicas: {}
        f:metrics: {}
        f:minReplicas: {}
        f:scaleTargetRef:
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-08-28T08:47:56Z"
Deploying a Nacos cluster on Kubernetes is not very different from deploying it directly on bare-metal machines.
The main difference is that Pods in Kubernetes have no fixed IP addresses, while a Nacos cluster needs the addresses of all of its members in its configuration.
Solution: give every instance a stable network identity by combining a headless Service with a StatefulSet, so that cluster members are addressed by DNS name rather than by IP (see the sketch below).
Implementation steps:
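With this approach each instance ends up reachable at a stable address of the form <pod-name>.<service-name>.<namespace>.svc.cluster.local, which is what the peer-finder plugin uses to build cluster.conf; a minimal sketch, assuming the default namespace:
nacos-0.nacos-headless.default.svc.cluster.local:8848
nacos-1.nacos-headless.default.svc.cluster.local:8848
nacos-2.nacos-headless.default.svc.cluster.local:8848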
1) Create the database configuration (Secret)
---
apiVersion: v1
kind: Secret
metadata:
  name: nacos
type: Opaque
data:
  mysql.db.host: "NDFjZTgyZjY4OWE5NGU4ZDk4YmRlOWQ2MDQ2ZTA0YWRpbjAxLmludGVybmFsLmNuLWVhc3QtMy5teXNxbC5yZHMubXlodWF3ZWljbG91ZC5jb20="
  mysql.db.name: "bmFjb3M="
  mysql.db.port: "MzMwNg=="
  mysql.db.user: "bmFjb3M="
  mysql.db.password: "V0ExNmdvVWE2bU5oUmlqRg=="
---
The values of a Secret must be base64-encoded. For example, to create the Secret for the user nacos, first encode the username (and likewise the password):
$ echo -n 'nacos' | openssl base64
bmFjb3M=
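To double-check an encoded value, or to create the Secret from the manifest above, something like the following can be used (nacos-secret.yml is only an assumed file name here):
$ echo 'bmFjb3M=' | openssl base64 -d   # prints: nacos
$ kubectl apply -f nacos-secret.yml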
2) Deploy the headless Service
The headless Service gives every Pod (Nacos instance) a stable DNS record, which is what the NACOS_SERVERS / cluster member configuration is built from:
---
apiVersion: v1
kind: Service
metadata:
  name: nacos-headless
  labels:
    app: nacos
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
  - port: 8848
    name: server
    targetPort: 8848
  - port: 9848
    name: client-rpc
    targetPort: 9848
  - port: 9849
    name: raft-rpc
    targetPort: 9849
  ## election port kept for compatibility with Nacos 1.4.x
  - port: 7848
    name: old-raft-rpc
    targetPort: 7848
  clusterIP: None
  selector:
    app: nacos
---
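Once the StatefulSet below is running, the per-Pod DNS records created by this headless Service can be verified from inside the cluster; a sketch, assuming the default namespace and that a busybox image can be pulled:
$ kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- \
    nslookup nacos-headless.default.svc.cluster.local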
3) Deploy Nacos with a StatefulSet
A StatefulSet gives every Pod a fixed, ordinal name such as nacos-0, nacos-1 and nacos-2.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nacos
spec:
  serviceName: nacos-headless
  replicas: 3
  selector:             # required by apps/v1; must match the Pod template labels
    matchLabels:
      app: nacos
  template:
    metadata:
      labels:
        app: nacos
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      volumes:
        - name: vol-163912341665228473
          hostPath:
            path: /opt/logs/
            type: ''
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - nacos
              topologyKey: "kubernetes.io/hostname"
      initContainers:
        - name: peer-finder-plugin-install
          image: nacos/nacos-peer-finder-plugin:1.1
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /home/nacos/plugins/peer-finder
              name: data
              subPath: peer-finder
      containers:
        - name: nacos
          imagePullPolicy: Always
          image: swr.cn-east-3.myhuaweicloud.com/huyi-base/nacos-server:2.0.3
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              memory: '4Gi'
              cpu: '2'
          ports:
            - containerPort: 8848
              name: client-port
            - containerPort: 9848
              name: client-rpc
            - containerPort: 9849
              name: raft-rpc
            - containerPort: 7848
              name: old-raft-rpc
          env:
            - name: NACOS_REPLICAS
              value: "3"
            - name: SERVICE_NAME
              value: "nacos-headless"
            - name: DOMAIN_NAME
              value: "cluster.local"
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: MYSQL_SERVICE_HOST
              valueFrom:
                secretKeyRef:
                  name: nacos
                  key: mysql.db.host
            - name: MYSQL_SERVICE_DB_NAME
              valueFrom:
                secretKeyRef:
                  name: nacos
                  key: mysql.db.name
            - name: MYSQL_SERVICE_PORT
              valueFrom:
                secretKeyRef:
                  name: nacos
                  key: mysql.db.port
            - name: MYSQL_SERVICE_USER
              valueFrom:
                secretKeyRef:
                  name: nacos
                  key: mysql.db.user
            - name: MYSQL_SERVICE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: nacos
                  key: mysql.db.password
            - name: NACOS_SERVER_PORT
              value: "8848"
            - name: NACOS_APPLICATION_PORT
              value: "8848"
            - name: PREFER_HOST_MODE
              value: "hostname"
          volumeMounts:
            - name: data
              mountPath: /home/nacos/plugins/peer-finder
              subPath: peer-finder
            - name: data
              mountPath: /home/nacos/data
              subPath: data
            - name: vol-163912341665228473
              mountPath: /home/nacos/logs/
              policy:            # log collection policy for this mount (CCE extended syntax)
                logs:
                  rotate: Hourly
                  annotations:
                    format: '{"multi":{"mode":"time","value":"YYYY-MM-DD hh:mm:ss"}}'
                    pathPattern: nacos.log
      imagePullSecrets:
        - name: default-secret
  volumeClaimTemplates:
    - metadata:
        name: data
        annotations:
          everest.io/disk-volume-type: SAS
        labels:
          failure-domain.beta.kubernetes.io/region: cn-east-3
          failure-domain.beta.kubernetes.io/zone: cn-east-3a
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 20Gi
        storageClassName: csi-disk
        selector:
          matchLabels:
            app: nacos
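A quick sanity check after applying the manifests (a sketch; cluster.conf is written by the peer-finder plugin inside the container):
$ kubectl get pods -l app=nacos                  # expect nacos-0, nacos-1 and nacos-2 in Running state
$ kubectl exec nacos-0 -- cat conf/cluster.conf  # should list the addresses of all three members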
4) Initialize the MySQL tables
Reference: the MySQL table-creation (建表) script provided with the Nacos release; a sketch of the import is shown below.
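A minimal sketch of the import, assuming the schema script shipped with the Nacos release (nacos-mysql.sql in 1.x, mysql-schema.sql in 2.x) and the database endpoint and credentials stored in the Secret from step 1:
$ mysql -h <mysql-host> -P 3306 -u nacos -p nacos < mysql-schema.sql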
Electronic signature (e-seal) architecture diagram: