In the earlier article "Flink 1.11.2 HA Cluster on Kubernetes with NFS" we described the base setup. Running it in production exposed a problem: under heavy input traffic, checkpoints failed frequently. Investigation showed that the way checkpoints were stored was to blame; after switching the state backend to RocksDB the failures stopped. The changes are documented below.
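Before changing anything you can confirm the symptom from the JobManager log, where failed or expired checkpoints are reported. A minimal check, assuming the JobManager runs as the first replica of the flink-cluster StatefulSet shown later in this article:
# flink-cluster-0 is an assumed pod name (first replica of the JobManager StatefulSet)
kubectl logs -n flink-ha flink-cluster-0 | grep -i checkpoint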
Create jobmanager-checkpoint-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jobmanager-checkpoint-pvc
  namespace: flink-ha
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  storageClassName: nfs
Create taskmanager-checkpoint-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: taskmanager-checkpoint-pvc
  namespace: flink-ha
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  storageClassName: nfs
kubectl apply -f jobmanager-checkpoint-pvc.yaml
kubectl apply -f taskmanager-checkpoint-pvc.yaml
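Before moving on, it is worth confirming that the nfs storage class bound both claims; a quick check:
# both PVCs should report STATUS Bound
kubectl get pvc -n flink-ha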
Modify jobmanager-flink-conf.yaml (the JobManager configuration):
apiVersion: v1
data:
  flink-conf.yaml: |-
    jobmanager.rpc.address: localhost
    blob.server.port: 6124
    jobmanager.rpc.port: 6123
    taskmanager.rpc.port: 6122
    queryable-state.proxy.ports: 6125
    jobmanager.memory.process.size: 3200m
    taskmanager.memory.process.size: 10240m
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 1
    high-availability: zookeeper
    high-availability.cluster-id: /flink-cluster
    high-availability.storageDir: file:/usr/flink/ha/flink-cluster
    high-availability.zookeeper.quorum: 192.168.1.205:2181
    # state.backend: filesystem
    # state.checkpoints.dir: file:/usr/flink/flink-checkpoints
    # state.checkpoints.num-retained: 100
    # state.savepoints.dir: hdfs://namenode-host:port/flink-checkpoints
    state.backend: rocksdb
    state.backend.incremental: false
    # Directory for storing checkpoints
    state.checkpoints.dir: file:///tmp/rocksdb/data/
    state.checkpoints.num-retained: 100
    jobmanager.execution.failover-strategy: region
    web.upload.dir: /usr/flink/jars
    # metrics reporter
    metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
    metrics.reporter.promgateway.host: pushgateway
    metrics.reporter.promgateway.port: 9091
    metrics.reporter.promgateway.jobName: flink-cluster-job
    metrics.reporter.promgateway.randomJobNameSuffix: true
    metrics.reporter.promgateway.deleteOnShutdown: true
    metrics.reporter.promgateway.groupingKey: k1=v1;k2=v2
    metrics.reporter.promgateway.interval: 60 SECONDS
  log4j-console.properties: |-
    # This affects logging for both user code and Flink
    rootLogger.level = INFO
    rootLogger.appenderRef.console.ref = ConsoleAppender
    rootLogger.appenderRef.rolling.ref = RollingFileAppender
    # Uncomment this if you want to _only_ change Flink's logging
    #logger.flink.name = org.apache.flink
    #logger.flink.level = INFO
    # The following lines keep the log level of common libraries/connectors on
    # log level INFO. The root logger does not override this. You have to manually
    # change the log levels here.
    logger.akka.name = akka
    logger.akka.level = INFO
    logger.kafka.name = org.apache.kafka
    logger.kafka.level = INFO
    logger.hadoop.name = org.apache.hadoop
    logger.hadoop.level = INFO
    logger.zookeeper.name = org.apache.zookeeper
    logger.zookeeper.level = INFO
    # Log all infos to the console
    appender.console.name = ConsoleAppender
    appender.console.type = CONSOLE
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    # Log all infos in the given rolling file
    appender.rolling.name = RollingFileAppender
    appender.rolling.type = RollingFile
    appender.rolling.append = false
    appender.rolling.fileName = ${sys:log.file}
    appender.rolling.filePattern = ${sys:log.file}.%i
    appender.rolling.layout.type = PatternLayout
    appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    appender.rolling.policies.type = Policies
    appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
    appender.rolling.policies.size.size = 100MB
    appender.rolling.strategy.type = DefaultRolloverStrategy
    appender.rolling.strategy.max = 10
    # Suppress the irrelevant (wrong) warnings from the Netty channel handler
    logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
    logger.netty.level = OFF
kind: ConfigMap
metadata:
  name: jobmanager-flink-conf
  namespace: flink-ha
Modify taskmanager-flink-conf.yaml (the TaskManager configuration):
apiVersion: v1
data:
  flink-conf.yaml: |-
    jobmanager.rpc.address: flink-cluster
    blob.server.port: 6124
    jobmanager.rpc.port: 6123
    taskmanager.rpc.port: 6122
    queryable-state.proxy.ports: 6125
    jobmanager.memory.process.size: 3200m
    taskmanager.memory.process.size: 10240m
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 1
    high-availability: zookeeper
    high-availability.cluster-id: /flink-cluster
    high-availability.storageDir: file:/usr/flink/ha/flink-cluster
    high-availability.zookeeper.quorum: 192.168.1.205:2181
    # state.backend: filesystem
    # state.checkpoints.dir: file:/usr/flink/flink-checkpoints
    # state.checkpoints.num-retained: 100
    # state.savepoints.dir: hdfs://namenode-host:port/flink-checkpoints
    state.backend: rocksdb
    state.backend.incremental: false
    # Directory for storing checkpoints
    state.checkpoints.dir: file:///tmp/rocksdb/data/
    state.checkpoints.num-retained: 100
    jobmanager.execution.failover-strategy: region
    web.upload.dir: /usr/flink/jars
    # metrics reporter
    metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
    metrics.reporter.promgateway.host: pushgateway
    metrics.reporter.promgateway.port: 9091
    metrics.reporter.promgateway.jobName: flink-cluster-job
    metrics.reporter.promgateway.randomJobNameSuffix: true
    metrics.reporter.promgateway.deleteOnShutdown: true
    metrics.reporter.promgateway.groupingKey: k1=v1;k2=v2
    metrics.reporter.promgateway.interval: 60 SECONDS
  log4j-console.properties: |-
    # This affects logging for both user code and Flink
    rootLogger.level = INFO
    rootLogger.appenderRef.console.ref = ConsoleAppender
    rootLogger.appenderRef.rolling.ref = RollingFileAppender
    # Uncomment this if you want to _only_ change Flink's logging
    #logger.flink.name = org.apache.flink
    #logger.flink.level = INFO
    # The following lines keep the log level of common libraries/connectors on
    # log level INFO. The root logger does not override this. You have to manually
    # change the log levels here.
    logger.akka.name = akka
    logger.akka.level = INFO
    logger.kafka.name = org.apache.kafka
    logger.kafka.level = INFO
    logger.hadoop.name = org.apache.hadoop
    logger.hadoop.level = INFO
    logger.zookeeper.name = org.apache.zookeeper
    logger.zookeeper.level = INFO
    # Log all infos to the console
    appender.console.name = ConsoleAppender
    appender.console.type = CONSOLE
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    # Log all infos in the given rolling file
    appender.rolling.name = RollingFileAppender
    appender.rolling.type = RollingFile
    appender.rolling.append = false
    appender.rolling.fileName = ${sys:log.file}
    appender.rolling.filePattern = ${sys:log.file}.%i
    appender.rolling.layout.type = PatternLayout
    appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    appender.rolling.policies.type = Policies
    appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
    appender.rolling.policies.size.size = 100MB
    appender.rolling.strategy.type = DefaultRolloverStrategy
    appender.rolling.strategy.max = 10
    # Suppress the irrelevant (wrong) warnings from the Netty channel handler
    logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
    logger.netty.level = OFF
kind: ConfigMap
metadata:
  name: taskmanager-flink-conf
  namespace: flink-ha
In both ConfigMaps above, the following block is what changed relative to the original setup:
# state.backend: filesystem
# state.checkpoints.dir: file:/usr/flink/flink-checkpoints
# state.checkpoints.num-retained: 100
# state.savepoints.dir: hdfs://namenode-host:port/flink-checkpoints
state.backend: rocksdb
state.backend.incremental: false
# Directory for storing checkpoints
state.checkpoints.dir: file:///tmp/rocksdb/data/
state.checkpoints.num-retained: 100
kubectl apply -f jobmanager-flink-conf.yaml
kubectl apply -f taskmanager-flink-conf.yaml
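To verify that the new state backend settings actually landed in the cluster, dump the rendered ConfigMap and look at the state.* keys (the same check works for taskmanager-flink-conf):
# expect state.backend: rocksdb and state.checkpoints.dir: file:///tmp/rocksdb/data/
kubectl get configmap jobmanager-flink-conf -n flink-ha -o yaml | grep "state\."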
Modify jobmanager.yaml (the JobManager StatefulSet) so that the new checkpoint PVC is mounted at /tmp:
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: flink-cluster
  namespace: flink-ha
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: flink-cluster
  serviceName: flink-cluster
  template:
    metadata:
      labels:
        app: flink-cluster
    spec:
      containers:
        - args:
            - jobmanager
          env:
            - name: TASK_MANAGER_NUMBER_OF_TASK_SLOTS
              value: "1"
            - name: TZ
              value: Asia/Shanghai
          image: 192.168.32.14/library/flink:1.11.2
          imagePullPolicy: Always
          name: flink-cluster
          ports:
            - containerPort: 6124
              name: blob
              protocol: TCP
            - containerPort: 6125
              name: query
              protocol: TCP
            - containerPort: 8081
              name: flink-ui
              protocol: TCP
          resources: {}
          securityContext:
            capabilities: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/flink
              name: vol1
            - mountPath: /opt/flink/conf/
              name: vol2
            - mountPath: /opt/flink/log
              name: vol3
            - mountPath: /tmp
              name: vol4
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: vol1
          persistentVolumeClaim:
            claimName: cluster-pvc
        - configMap:
            defaultMode: 420
            items:
              - key: flink-conf.yaml
                mode: 420
                path: flink-conf.yaml
              - key: log4j-console.properties
                mode: 420
                path: log4j-console.properties
            name: jobmanager-flink-conf
            optional: false
          name: vol2
        - name: vol3
          persistentVolumeClaim:
            claimName: cluster-log
        - name: vol4
          persistentVolumeClaim:
            claimName: jobmanager-checkpoint-pvc
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
Modify taskmanager.yaml (the TaskManager Deployment) in the same way:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  generation: 29
  name: flink-taskmanager
  namespace: flink-ha
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
        - args:
            - taskmanager
          env:
            - name: TASK_MANAGER_NUMBER_OF_TASK_SLOTS
              value: "1"
            - name: TZ
              value: Asia/Shanghai
          image: 192.168.32.14/library/flink:1.11.2
          imagePullPolicy: IfNotPresent
          name: taskmanager
          ports:
            - containerPort: 6122
              name: rpc
              protocol: TCP
            - containerPort: 6125
              name: query-state
              protocol: TCP
          resources: {}
          securityContext:
            capabilities: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /usr/flink
              name: vol1
            - mountPath: /opt/flink/conf/
              name: vol2
            - mountPath: /opt/flink/log
              name: vol3
            - mountPath: /tmp
              name: vol4
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - name: vol1
          persistentVolumeClaim:
            claimName: cluster-pvc
        - configMap:
            defaultMode: 420
            items:
              - key: flink-conf.yaml
                path: flink-conf.yaml
              - key: log4j-console.properties
                path: log4j-console.properties
            name: taskmanager-flink-conf
            optional: false
          name: vol2
        - name: vol3
          persistentVolumeClaim:
            claimName: cluster-log
        - name: vol4
          persistentVolumeClaim:
            claimName: taskmanager-checkpoint-pvc
Relative to the original jobmanager.yaml and taskmanager.yaml, the changes are the following.
An extra volume mount in the container spec:
            - mountPath: /tmp
              name: vol4
And the matching PVC-backed volume (taskmanager-checkpoint-pvc in the TaskManager Deployment, jobmanager-checkpoint-pvc in the JobManager StatefulSet):
        - name: vol4
          persistentVolumeClaim:
            claimName: taskmanager-checkpoint-pvc
        - name: vol4
          persistentVolumeClaim:
            claimName: jobmanager-checkpoint-pvc
kubectl apply -f jobmanager.yaml
kubectl apply -f taskmanager.yaml
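A few sanity checks after the rollout; the pod name flink-cluster-0 is an assumption (the first JobManager replica), and the checkpoint directory only fills up once a running job has completed at least one checkpoint:
kubectl rollout status statefulset/flink-cluster -n flink-ha
kubectl rollout status deployment/flink-taskmanager -n flink-ha
# /tmp should now be backed by the NFS volume (pod name is an assumption)
kubectl exec -n flink-ha flink-cluster-0 -- df -h /tmp
# checkpoint data appears here only after the first successful checkpoint
kubectl exec -n flink-ha flink-cluster-0 -- ls /tmp/rocksdb/data/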
All other steps are the same as in "Flink 1.11.2 HA Cluster on Kubernetes with NFS".