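Deployment mode 1:
Standalone mode:
Plan: three CentOS 7 hosts, 185-187; 185 is the JobManager, 186 and 187 are the TaskManagers
1. Download flink-1.7.1-bin-scala_2.12.tgz; do not download the build that bundles the Hadoop dependencies.
2. Extract the archive.
3. Configure flink-1.7.1/conf/flink-conf.yaml. Configuration notes: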
##################################
jobmanager.rpc.address: 162.168.1.185
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.size: 10240m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 10240m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline. (Number of slots per TaskManager)
taskmanager.numberOfTaskSlots: 12
# The parallelism used for programs that did not specify any other parallelism. (Default parallelism)
parallelism.default: 12
# Temporary directory
io.tmp.dirs: /home/flink/tmp
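The slot count and the default parallelism interact: with Flink's default slot sharing, a job at parallelism p needs p free slots, and the cluster offers (number of TaskManagers) x taskmanager.numberOfTaskSlots slots in total. A minimal sketch of that arithmetic for the plan above (2 TaskManagers, 12 slots each); the function names total_slots and fits_cluster are made up for illustration and are not part of Flink:

```shell
# Hypothetical helpers, not part of Flink.

# Total task slots offered by the cluster.
# $1 = number of TaskManagers, $2 = taskmanager.numberOfTaskSlots
total_slots() {
    echo $(( $1 * $2 ))
}

# Succeeds (exit 0) when a job of the given parallelism fits in the cluster.
# $1 = parallelism, $2 = number of TaskManagers, $3 = slots per TaskManager
fits_cluster() {
    [ "$1" -le "$(total_slots "$2" "$3")" ]
}

# Plan above: 2 TaskManagers x 12 slots, parallelism.default = 12
total_slots 2 12          # prints 24
if fits_cluster 12 2 12; then
    echo "parallelism 12 fits"
fi
```

So with these settings two jobs at the default parallelism can run side by side before the cluster runs out of slots.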
4. Configure flink-1.7.1/conf/masters, i.e. the JobManager node as IP/HOST:port (the port is the web UI port)
162.168.1.185:8089
5. Configure flink-1.7.1/conf/slaves, i.e. the TaskManager (worker) nodes, one per line
162.168.1.186
162.168.1.187
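Flink's start scripts read conf/masters and conf/slaves, one entry per line. A small sketch for generating both files in one go, assuming a hypothetical helper name write_cluster_files; the demo writes into a scratch directory rather than the real conf/:

```shell
# Hypothetical helper: write the masters and slaves files.
# $1 = conf directory, $2 = JobManager host:port, remaining args = worker hosts
write_cluster_files() {
    conf_dir=$1
    master=$2
    shift 2
    printf '%s\n' "$master" > "$conf_dir/masters"
    # printf repeats the format for each argument: one worker per line.
    printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Demo against a scratch directory.
demo_dir=$(mktemp -d)
write_cluster_files "$demo_dir" 162.168.1.185:8089 162.168.1.186 162.168.1.187
cat "$demo_dir/slaves"
```

Point the first argument at flink-1.7.1/conf once the output looks right.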
6. Copy the flink-1.7.1 directory to the TaskManager (worker) nodes; the directory path must be the same on every node
scp -r flink-1.7.1 162.168.1.186:/home/flink/
scp -r flink-1.7.1 162.168.1.187:/home/flink/
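With more workers, typing one scp per host gets tedious. A possible sketch that derives the commands from the slaves file instead; print_scp_commands is a hypothetical name, and it only prints the commands so they can be reviewed before running:

```shell
# Hypothetical helper: print one scp command per host listed in a slaves file.
# $1 = slaves file, $2 = local directory to copy, $3 = remote parent directory
print_scp_commands() {
    # "|| [ -n "$host" ]" keeps a last line that has no trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        echo "scp -r $2 $host:$3"
    done < "$1"
}

# Usage (review the output first, then pipe it to sh to execute):
#   print_scp_commands flink-1.7.1/conf/slaves flink-1.7.1 /home/flink/
#   print_scp_commands flink-1.7.1/conf/slaves flink-1.7.1 /home/flink/ | sh
```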
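7. Set up passwordless SSH (mutual trust) for the flink user
Run the following as the flink user on each of the three machines in turn:
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub 162.168.1.186
ssh-copy-id -i ~/.ssh/id_rsa.pub 162.168.1.187
ssh-copy-id -i ~/.ssh/id_rsa.pub 162.168.1.185
Make sure each machine's public key has been distributed to the other two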
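One way to confirm the key distribution worked is to attempt a non-interactive login to every host. A minimal sketch, assuming a hypothetical helper named check_passwordless_ssh; BatchMode=yes makes ssh fail instead of prompting for a password, so a missing key shows up as a failure rather than a hang:

```shell
# Hypothetical helper: succeed only if every host accepts key-based login.
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # "true" is the remote command; we only care whether login succeeds.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, as the flink user on each of the three machines:
#   check_passwordless_ssh 162.168.1.185 162.168.1.186 162.168.1.187
```

A host that has never been contacted may still fail on the host-key prompt; logging in once by hand clears that.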
8. On 185, run flink-1.7.1/bin/start-cluster.sh
9. Open 162.168.1.185:8089 in a browser to see the cluster information
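The same check can be done from a terminal through the REST endpoint behind the web UI. The sketch below assumes the /overview response is JSON containing numeric fields such as taskmanagers and slots-total (field names as recalled from Flink's monitoring REST API; verify against your version). json_number is a hypothetical helper:

```shell
# Hypothetical helper: pull one numeric field out of the JSON on stdin.
# $1 = field name, e.g. taskmanagers or slots-total
json_number() {
    sed -n "s/.*\"$1\":[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Usage against the running cluster (assumed endpoint and field names):
#   curl -s http://162.168.1.185:8089/overview | json_number taskmanagers
#   curl -s http://162.168.1.185:8089/overview | json_number slots-total
```

For the plan above one would expect 2 TaskManagers to be registered.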
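Deployment mode 2:
High-availability cluster mode (High Availability)
Advantage: the earlier deployment is a standalone cluster with a single JobManager node; once that node goes down, the cluster becomes unusable. Here several JobManager nodes are started, and ZooKeeper manages leader election among them, giving fault tolerance and high availability
Note: download the build that bundles the Hadoop dependencies, and it must match the Hadoop cluster's version
1. First, a ZooKeeper cluster is needed; three nodes are enough
2. An HDFS file system is needed for data storage
3. Configure flink-conf.yaml, changing some settings: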
#################################
jobmanager.rpc.address: localhost
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.size: 10240m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 10240m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 12
# The parallelism used for programs that did not specify any other parallelism.
parallelism.default: 12
# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
high-availability: zookeeper
# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
#
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...)
#
high-availability.storageDir: hdfs://fulltext-linux185:9001/home/flink/ha/
# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
high-availability.zookeeper.quorum: 162.168.1.185:2181,162.168.1.186:2181,162.168.1.187:2181
state.backend: filesystem
# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
state.checkpoints.dir: hdfs://fulltext-linux185:9001/home/flink/flink-checkpoints
# Default target directory for savepoints, optional.
#
state.savepoints.dir: hdfs://fulltext-linux185:9001/home/flink/flink-checkpoints
jobmanager.archive.fs.dir: hdfs://fulltext-linux185:9001/home/flink/completed-jobs/
# The address under which the web-based HistoryServer listens.
#historyserver.web.address: 0.0.0.0
# The port under which the web-based HistoryServer listens.
#historyserver.web.port: 8082
# Comma separated list of directories to monitor for completed jobs.
historyserver.archive.fs.dir: hdfs://fulltext-linux185:9001/home/flink/completed-jobs/
# Interval in milliseconds for refreshing the monitored directories.
historyserver.archive.fs.refresh-interval: 10000
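Of the settings above, three are what actually switch on HA: high-availability, high-availability.storageDir and high-availability.zookeeper.quorum. A quick sanity check before starting the cluster might look like this; check_ha_config is a hypothetical helper and only looks for uncommented, non-empty keys:

```shell
# Hypothetical helper: report HA keys missing from a flink-conf.yaml.
# $1 = path to flink-conf.yaml; returns non-zero if any key is absent.
check_ha_config() {
    missing=0
    for key in high-availability high-availability.storageDir \
               high-availability.zookeeper.quorum; do
        # Match only uncommented "key: value" lines with a non-empty value.
        if ! grep -q "^$key:[[:space:]]*[^[:space:]]" "$1"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_ha_config flink-1.7.1/conf/flink-conf.yaml && echo "HA keys present"
```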
4. Configure flink-1.7.1/conf/masters, adding standby JobManager nodes:
###############################
fulltext-linux185:8089
fulltext-linux186:8089
5. Configure flink-1.7.1/conf/slaves; because machines are scarce, the TaskManager and JobManager nodes have to share machines
##################################
fulltext-linux185
fulltext-linux186
fulltext-linux187
6. Start the ZooKeeper cluster
7. Start the HDFS cluster
8. Start the Flink cluster:
###########
./start-cluster.sh
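9. Open the web UI at 162.168.1.185:8089 in a browser; it redirects automatically to the service on the current JobManager node, which is not necessarily 185 and may be 186, whichever was elected leader through ZooKeeper
10. Verify high availability: kill the current JobManager process and wait a few seconds; a standby JobManager node takes over automatically through ZooKeeper
###########
[flink@fulltext-linux185 bin]$ jps
32837 Jps
32793 StandaloneSessionClusterEntrypoint
30748 TaskManagerRunner
[flink@fulltext-linux185 bin]$ kill -9 32793
# After this, 185:8089 is no longer reachable from the web, but in under 10 seconds 186:8089 is reachable and shows that the JobManager is now 186
# Command to restart the JobManager: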
[flink@fulltext-linux185 bin]$ ./jobmanager.sh start fulltext-linux185 8089
Starting standalonesession daemon on host fulltext-linux185.
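When killing the JobManager in step 10, the PID to use is the one jps lists next to StandaloneSessionClusterEntrypoint, not the PID of Jps itself or of TaskManagerRunner. A small sketch that picks it out of the jps output; jobmanager_pid is a hypothetical helper that reads the listing on stdin:

```shell
# Hypothetical helper: read `jps` output on stdin, print the JobManager PID.
# StandaloneSessionClusterEntrypoint is the JobManager's main class in the
# jps listing from step 10.
jobmanager_pid() {
    awk '$2 == "StandaloneSessionClusterEntrypoint" { print $1 }'
}

# Usage on the current leader:
#   kill -9 "$(jps | jobmanager_pid)"
```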
Deployment mode 3: high-availability Flink on YARN; it only needs to be deployed on one machine, similar to Spark