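Preface:
Our project now runs on Rancher-managed Kubernetes. Rancher's built-in app catalog can deploy an EFK cluster with one click, but production has security requirements, so the EFK cluster has to be modified to require a username and password at login.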
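1. EFK base setup
This uses the EFK app from Rancher's built-in app catalog, with custom image addresses (mirrored into Harbor).
All images come from Elastic's official registry, and all of them are version 7.7.1:
Image download address: https://www.docker.elastic.co/
Because the log data is not critical, I did not enable persistent storage, which also gives somewhat better performance. The downside is that redeploying wipes all Elasticsearch data. Rancher's own distributed storage, Longhorn, has now been officially released and is easy to configure, so if you have the resources, consider keeping the data on distributed storage.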
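2. Configuration changes
2.1 Elasticsearch StatefulSet changes:
Changed parameters:
env: ES_JAVA_OPTS is unrelated to authentication; the default heap is too small and the nodes run out of memory easily. ELASTIC_USERNAME and ELASTIC_PASSWORD are there for the Elasticsearch readiness probe.
- name: ES_JAVA_OPTS
  value: -Xmx4g -Xms4g
- name: xpack.security.enabled
  value: "true"
- name: ELASTIC_USERNAME
  value: elastic
- name: ELASTIC_PASSWORD
  value: elasticpassword
resources: also unrelated to enabling authentication; the default resources are too small and the nodes run out of memory easily.
resources:
  limits:
    cpu: "4"
    memory: 8Gi
  requests:
    cpu: 100m
    memory: 8Gi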
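The ELASTIC_PASSWORD value above sits in the StatefulSet in plain text. One possible alternative, shown here only as a sketch, is to keep it in a Kubernetes Secret. The Secret name elastic-credentials and its keys are my own assumptions, not something the catalog app creates, and the function does no YAML escaping, so a password containing quotes would need extra care.

```shell
# Sketch only: the Secret name "elastic-credentials" and the key names
# "username"/"password" are hypothetical, chosen for illustration.
render_elastic_secret() {
  # $1 = namespace, $2 = password for the built-in "elastic" user
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: elastic-credentials
  namespace: $1
type: Opaque
stringData:
  username: elastic
  password: "$2"
EOF
}

# Usage (not executed here):
#   render_elastic_secret efk 'elasticpassword' | kubectl apply -f -
```

With such a Secret in place, the ELASTIC_PASSWORD env entry could use valueFrom.secretKeyRef (name: elastic-credentials, key: password) instead of a literal value.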
The complete YAML as shown in Rancher:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    esMajorVersion: "7"
    field.cattle.io/publicEndpoints: '[{"addresses":["10.1.99.51"],"port":80,"protocol":"HTTP","serviceName":"efk:elasticsearch-master-headless","ingressName":"efk:elastic-ingress","hostname":"elastic-prod.hlet.com","allNodes":true}]'
  creationTimestamp: "2020-06-03T08:34:13Z"
  generation: 4
  labels:
    app: elasticsearch-master
    chart: elasticsearch-7.3.0
    heritage: Tiller
    io.cattle.field/appId: efk
    release: efk
  name: elasticsearch-master
  namespace: efk
  resourceVersion: "22963322"
  selfLink: /apis/apps/v1/namespaces/efk/statefulsets/elasticsearch-master
  uid: 03f40362-4e89-4bd1-b8d3-285a36cbce35
spec:
  podManagementPolicy: Parallel
  replicas: 5
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: elasticsearch-master
  serviceName: elasticsearch-master-headless
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: elasticsearch-master
        chart: elasticsearch-7.3.0
        heritage: Tiller
        release: efk
      name: elasticsearch-master
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - elasticsearch-master
            topologyKey: kubernetes.io/hostname
      containers:
      - env:
        - name: node.name
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: cluster.initial_master_nodes
          value: elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2,elasticsearch-master-3,elasticsearch-master-4,
        - name: discovery.seed_hosts
          value: elasticsearch-master-headless
        - name: cluster.name
          value: elasticsearch
        - name: network.host
          value: 0.0.0.0
        - name: ES_JAVA_OPTS
          value: -Xmx4g -Xms4g
        - name: node.data
          value: "true"
        - name: node.ingest
          value: "true"
        - name: node.master
          value: "true"
        - name: xpack.security.enabled
          value: "true"
        - name: ELASTIC_USERNAME
          value: elastic
        - name: ELASTIC_PASSWORD
          value: elasticpassword
        image: 10.1.99.42/ranchercharts/elasticsearch-elasticsearch:7.7.1
        imagePullPolicy: IfNotPresent
        name: elasticsearch
        ports:
        - containerPort: 9200
          name: http
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
              # Once it has started only check that the node itself is responding
              START_FILE=/tmp/.es_start_file
              http () {
                local path="${1}"
                if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                  BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                else
                  BASIC_AUTH=''
                fi
                curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
              }
              if [ -f "${START_FILE}" ]; then
                echo 'Elasticsearch is already running, lets check the node is healthy'
                http "/"
              else
                echo 'Waiting for elasticsearch cluster to become cluster to be ready (request params: "wait_for_status=green&timeout=1s" )'
                if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
                  touch ${START_FILE}
                  exit 0
                else
                  echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                  exit 1
                fi
              fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: 100m
            memory: 8Gi
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        image: 10.1.99.42/ranchercharts/elasticsearch-elasticsearch:7.7.1
        imagePullPolicy: IfNotPresent
        name: configure-sysctl
        resources: {}
        securityContext:
          privileged: true
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
      terminationGracePeriodSeconds: 120
  updateStrategy:
    type: RollingUpdate
status:
  collisionCount: 0
  currentReplicas: 5
  currentRevision: elasticsearch-master-85f58497dd
  observedGeneration: 4
  readyReplicas: 5
  replicas: 5
  updateRevision: elasticsearch-master-85f58497dd
  updatedReplicas: 5
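After making the changes, click Save and the Elasticsearch cluster redeploys automatically.
Note: if the cluster never finishes initializing, delete all Elasticsearch pods in one go so that every node re-initializes from scratch.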
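Deleting every Elasticsearch pod at once can be done with a label selector. This is a minimal sketch that assumes the app=elasticsearch-master label and the efk namespace from the StatefulSet above; the KUBECTL variable is a hypothetical hook I added so the command can be previewed before running it for real.

```shell
# KUBECTL is a hypothetical override hook: set KUBECTL=echo to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Delete all Elasticsearch pods in one call. Because the StatefulSet uses
# podManagementPolicy: Parallel, the controller recreates them together.
restart_all_es_pods() {
  ns="${1:-efk}"
  "$KUBECTL" delete pod -n "$ns" -l app=elasticsearch-master
}

# Usage (not executed here):
#   restart_all_es_pods efk
#   kubectl get pods -n efk -l app=elasticsearch-master -w
```

Since this setup has no persistent storage, doing this also discards whatever data the cluster held.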
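Once the redeployment finishes, initialize the passwords of Elasticsearch's built-in accounts.
Open a shell on any Elasticsearch node and run:
elasticsearch-setup-passwords interactive
At this point the Elasticsearch cluster is initialized.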
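A quick way to confirm that authentication is really enforced is to compare an anonymous request with an authenticated one. The sketch below assumes it runs somewhere that can reach port 9200 (for example inside an Elasticsearch pod); the function names and the ES_URL variable are my own, not part of any Elastic tooling.

```shell
# ES_URL is an assumption: adjust it to wherever port 9200 is reachable from.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Print only the HTTP status code of a GET request.
# $1 = URL, remaining arguments are passed through to curl.
http_status() {
  url="$1"; shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# Translate a status code into a short verdict.
describe_status() {
  case "$1" in
    200) echo "authenticated" ;;
    401) echo "credentials missing or rejected" ;;
    000) echo "no response (node not reachable)" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# $1 = password chosen for the "elastic" user during setup.
check_es_auth() {
  echo "anonymous:    $(describe_status "$(http_status "$ES_URL/_cluster/health")")"
  echo "elastic user: $(describe_status "$(http_status "$ES_URL/_cluster/health" -u "elastic:$1")")"
}

# Usage (not executed here):
#   check_es_auth 'elasticpassword'
```

With security enabled, I would expect the anonymous line to report rejected credentials and the second line to report success.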
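2.2 Kibana configuration changes
Because Kibana was deployed automatically from the app catalog, two Services are generated for it: efk-kibana and kibana-http.
In practice, when I attached the Service to the Ingress, Kibana became unreachable. Specifically, http://0.0.0.0:5601 worked from inside the Kibana container, but http://efk-kibana:5601 did not. I worked around it by adding a headless Service named efk-kibana-headless and pointing the Kibana Ingress at it. Some time later the original Service started working again on its own.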
[root@hlet-prod-k8s-rancher ~]# kubectl get svc -n efk
NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
efk-kibana                      ClusterIP   10.43.127.11    <none>        5601/TCP            17d
efk-kibana-headless             ClusterIP   None            <none>        5601/TCP            130m
elasticsearch-apm               ClusterIP   10.43.238.31    <none>        8200/TCP            52d
elasticsearch-heartbeat         ClusterIP   10.43.172.214   <none>        9200/TCP            2d
elasticsearch-master            ClusterIP   10.43.21.168    <none>        9200/TCP,9300/TCP   17d
elasticsearch-master-headless   ClusterIP   None            <none>        9200/TCP,9300/TCP   17d
kibana-http                     ClusterIP   10.43.71.157    <none>        80/TCP              174m
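Ingress configuration:
The Service definitions are the stock ones, so they are not reproduced here.
Two parts of the Kibana YAML were changed:
env: there are two sets of credentials. ELASTICSEARCH_USERNAME and ELASTICSEARCH_PASSWORD are what Kibana uses to connect to the Elasticsearch cluster; ELASTIC_USERNAME and ELASTIC_PASSWORD are read by the readiness probe script.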
- name: xpack.security.enabled
  value: "true"
- name: ELASTICSEARCH_USERNAME
  value: kibana
- name: ELASTIC_USERNAME
  value: kibana
- name: ELASTICSEARCH_PASSWORD
  value: elasticpassword
- name: ELASTIC_PASSWORD
  value: elasticpassword
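Health check (readiness probe): only the last line was changed. With authentication enabled, the default path keeps returning 404 when the request is not logged in, so the probe now requests /login instead.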
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - |
      #!/usr/bin/env bash -e
      http () {
          local path="${1}"
          set -- -XGET -s --fail
          if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
            set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
          fi
          curl -k "$@" "http://localhost:5601${path}"
      }
      http "/login"
The complete Deployment YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "23"
    field.cattle.io/publicEndpoints: '[{"addresses":["10.1.99.51"],"port":80,"protocol":"HTTP","serviceName":"efk:kibana-http","ingressName":"efk:kibana-ingress","hostname":"kibana-prod.hlet.com","allNodes":true}]'
  creationTimestamp: "2020-05-26T00:53:53Z"
  generation: 49
  labels:
    app: kibana
    io.cattle.field/appId: efk
    release: efk
  name: efk-kibana
  namespace: efk
  resourceVersion: "23026049"
  selfLink: /apis/apps/v1/namespaces/efk/deployments/efk-kibana
  uid: 85017148-3738-46f9-8e29-65d072549a92
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: kibana
      release: efk
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        cattle.io/timestamp: "2020-06-09T00:17:32Z"
        field.cattle.io/ports: '[[{"containerPort":80,"dnsName":"efk-kibana","kind":"ClusterIP","name":"http","protocol":"TCP"}],[{"containerPort":5601,"dnsName":"efk-kibana","kind":"ClusterIP","name":"5601tcp2","protocol":"TCP"}]]'
        field.cattle.io/publicEndpoints: '[{"addresses":["10.1.99.51"],"allNodes":true,"hostname":"kibana-prod.hlet.com","ingressId":"efk:kibana-ingress","port":80,"protocol":"HTTP","serviceId":"efk:kibana-http"}]'
      creationTimestamp: null
      labels:
        app: kibana
        release: efk
    spec:
      containers:
      - args:
        - nginx
        - -g
        - daemon off;
        - -c
        - /nginx/nginx.conf
        image: rancher/nginx:1.15.8-alpine
        imagePullPolicy: IfNotPresent
        name: kibana-proxy
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /nginx/
          name: kibana-nginx
      - env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch-master:9200
        - name: I18N_LOCALE
          value: zh-CN
        - name: LOGGING_QUIET
          value: "true"
        - name: SERVER_HOST
          value: 0.0.0.0
        - name: xpack.security.enabled
          value: "true"
        - name: ELASTICSEARCH_USERNAME
          value: kibana
        - name: ELASTIC_USERNAME
          value: kibana
        - name: ELASTICSEARCH_PASSWORD
          value: elasticpassword
        - name: ELASTIC_PASSWORD
          value: elasticpassword
        image: 10.1.99.42/ranchercharts/kibana-kibana:7.7.1
        imagePullPolicy: IfNotPresent
        name: kibana
        ports:
        - containerPort: 5601
          name: 5601tcp2
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              http () {
                  local path="${1}"
                  set -- -XGET -s --fail
                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    set -- "$@" -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  fi
                  curl -k "$@" "http://localhost:5601${path}"
              }
              http "/login"
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 500Mi
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: nginx.conf
            mode: 438
            path: nginx.conf
          name: efk-kibana-nginx
        name: kibana-nginx
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-06-12T07:46:09Z"
    lastUpdateTime: "2020-06-12T07:46:09Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2020-06-12T07:29:26Z"
    lastUpdateTime: "2020-06-12T07:46:09Z"
    message: ReplicaSet "efk-kibana-9884bd66b" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 49
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
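At this point you can try logging in to Kibana; the login page should now appear.
2.3 APM configuration changes
Our Elastic stack also uses APM, so the APM settings have to be changed as well.
Original deployment steps:
APM is not included in the app catalog, so it is deployed from the YAML below.
Deployment order:
kubectl create configmap elasticsearch-apm --from-file=apm-server.docker.yml -n efk
kubectl apply -f elasticsearch-apm-server.yaml
apm-server.docker.yml: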
apm-server:
  host: "0.0.0.0:8200"
  kibana.enabled: true
  kibana.host: "efk-kibana:5601"
  kibana.protocol: "http"
logging.level: warning
output.elasticsearch:
  hosts: ["elasticsearch-master-headless:9200"]
elasticsearch-apm-server.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: elasticsearch-apm
  name: elasticsearch-apm
  namespace: efk
spec:
  replicas: 1
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: elasticsearch-apm
  template:
    metadata:
      labels:
        app: elasticsearch-apm
    spec:
      containers:
      - image: 10.1.99.42/docker.elastic.co/apm/apm-server:7.7.1
        imagePullPolicy: IfNotPresent
        name: elasticsearch-apm
        ports:
        - containerPort: 8200
          protocol: TCP
        resources:
          limits:
            cpu: "1"
          requests:
            cpu: 25m
            memory: 512Mi
        volumeMounts:
        - mountPath: /usr/share/apm-server/apm-server.yml
          name: config
          subPath: apm-server.docker.yml
      volumes:
      - configMap:
          defaultMode: 420
          name: elasticsearch-apm
        name: config
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch-apm
  name: elasticsearch-apm
  namespace: efk
spec:
  ports:
  - name: elasticsearch-apm
    port: 8200
    protocol: TCP
  selector:
    app: elasticsearch-apm
Modify the configuration file to work with user authentication
Edit the elasticsearch-apm ConfigMap:
apm-server.docker.yml:
apm-server:
  host: "0.0.0.0:8200"
  kibana.enabled: true
  kibana.host: "efk-kibana-headless:5601"
  kibana.username: "elastic"
  kibana.password: "elasticpassword"
  kibana.protocol: "http"
logging.level: warning
#logging.level: info
output.elasticsearch:
  hosts: ["elasticsearch-master-headless:9200"]
  username: "elastic"
  password: "elasticpassword"
After the change, redeploy the APM workload so it picks up the new configuration.
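The redeploy matters because the config file is mounted through subPath, and a subPath mount does not pick up ConfigMap updates in a running pod. As a rough sketch, the ConfigMap can be regenerated from the edited file and the Deployment restarted as below. It assumes a kubectl recent enough for --dry-run=client (1.18 or later; older clients spell it --dry-run) and for rollout restart; the KUBECTL variable is a hypothetical hook for previewing the commands.

```shell
# KUBECTL is a hypothetical override hook: set KUBECTL=echo to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Re-render the elasticsearch-apm ConfigMap from the local file, apply it,
# then restart the Deployment so the pod mounts the new content.
reload_apm_config() {
  "$KUBECTL" create configmap elasticsearch-apm \
      --from-file=apm-server.docker.yml -n efk \
      --dry-run=client -o yaml | "$KUBECTL" apply -f - &&
  "$KUBECTL" rollout restart deployment/elasticsearch-apm -n efk
}

# Usage (not executed here), from the directory holding apm-server.docker.yml:
#   reload_apm_config
```

Editing the ConfigMap in the Rancher UI and clicking Redeploy on the workload achieves the same result.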
2.4 Filebeat configuration changes
Filebeat ships with the app catalog deployment, so it is enough to edit its ConfigMap directly.
Edit the efk-filebeat-config ConfigMap:
filebeat.yml:
filebeat.inputs:
- type: docker
  containers.ids:
  - '*'
  processors:
  - add_kubernetes_metadata:
      in_cluster: true
output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS:elasticsearch-master:9200}'
  username: "elastic"
  password: "elasticpassword"
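To check that Filebeat can still write after the credentials are added, one option is to look at the document counts of its indices. This is only a sketch: it assumes Filebeat uses its default filebeat-* index naming and that the elasticsearch-master Service is reachable from where the script runs; the function names are my own.

```shell
# ES_URL is an assumption: the in-cluster Service name used elsewhere in this post.
ES_URL="${ES_URL:-http://elasticsearch-master:9200}"

# List Filebeat indices as "<index> <docs.count>" lines via the _cat/indices API.
# $1 = password of the "elastic" user.
list_filebeat_indices() {
  curl -s --fail -u "elastic:$1" "$ES_URL/_cat/indices/filebeat-*?h=index,docs.count"
}

# Sum the second column (document counts) read from stdin.
sum_docs() {
  awk '{ total += $2 } END { print total + 0 }'
}

# Usage (not executed here):
#   list_filebeat_indices 'elasticpassword' | sum_docs
```

Running it twice a minute or so apart should show the total growing if logs are flowing.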