Installing HDFS
Flink standalone HA depends on HDFS, so HDFS must be installed first. Hadoop 2.6.5 is used here.
Download the packages
Flink shaded Hadoop jar: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.6.5-10.0/flink-shaded-hadoop-2-uber-2.6.5-10.0.jar
Hadoop download: https://archive.apache.org/dist/hadoop/common/hadoop-2.6.5/
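For example with wget (the tarball name hadoop-2.6.5.tar.gz under the directory above is assumed):
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.6.5-10.0/flink-shaded-hadoop-2-uber-2.6.5-10.0.jar
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.5/hadoop-2.6.5.tar.gz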
Installation
Copy flink-shaded-hadoop-2-uber-2.6.5-10.0.jar into /usr/local/flink/lib.
Unpack hadoop-2.6.5 and copy it to the installation directory, as sketched below.
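A minimal sketch, assuming the tarball sits in the current directory and /usr/local/hadoop is the installation directory (matching HADOOP_HOME below):
tar -xzf hadoop-2.6.5.tar.gz
mv hadoop-2.6.5 /usr/local/hadoop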
Configure the environment variables
vi /etc/profile
# add the following two lines
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
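Reload the profile and verify that the hadoop binaries are found:
source /etc/profile
hadoop version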
Configure HDFS. Note that HDFS itself must also run in high-availability mode here.
core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>zetagmaster:2181,zetagworker1:2181,zetagworker2:2181</value>
    </property>
</configuration>
hdfs-site.xml
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>zetagmaster:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>zetagworker1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>zetagmaster:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>zetagworker1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://zetagmaster:8485;zetagworker1:8485;zetagworker2:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/data/jn</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/namenode/data</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/datanode/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
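The sshfence method above only works if the NameNode hosts can ssh to each other without a password using the configured private key. A minimal setup sketch, to be run on both NameNode hosts, assuming the root account is used:
ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id root@zetagmaster
ssh-copy-id root@zetagworker1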
Distribute the Hadoop installation directory to all machines in the cluster via scp:
scp -r /usr/local/hadoop [email protected]:/usr/local/
scp -r /usr/local/hadoop [email protected]:/usr/local/
Then start the JournalNode (quorum journal) process on each of the three servers:
sbin/hadoop-daemon.sh start journalnode
Format the first NameNode (nn1, on zetagmaster) and start it
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
The other NameNode (nn2, on zetagworker1) must first sync the metadata from the first one, then start
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
At this point both NameNodes are in standby state, so one must be forced to active
bin/hdfs haadmin -transitionToActive --forcemanual nn1
## check the state of both NameNodes
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2
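If everything worked, the first command should print active and the second standby.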
Initialize the HA directory (znode) in ZooKeeper; the ZooKeeper quorum must already be running
bin/hdfs zkfc -formatZK
Start the cluster from the master
sbin/start-dfs.sh
After starting the cluster, check that all processes came up (for example with jps, as sketched after this list), including:
DataNode×3
JournalNode×3
DFSZKFailoverController×2
NameNode×2
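A quick check is to run jps on every host; a sketch of the expected distribution, following the roles configured above:
jps
# zetagmaster: NameNode, DataNode, JournalNode, DFSZKFailoverController
# zetagworker1: NameNode, DataNode, JournalNode, DFSZKFailoverController
# zetagworker2: DataNode, JournalNode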
If HDFS was installed and started correctly, the cluster state can be inspected through the web UI:
http://zetagmaster:50070
http://zetagworker1:50070
Because failover invokes remote ssh commands that rely on fuser, install psmisc (which provides fuser) on all servers:
yum install -y psmisc
Configure Flink high availability
Modify the Flink configuration
masters (in conf/): add the address and port of every host that should run a JobManager
zetagmaster:8081
zetagworker1:8081
flink-conf.yaml
high-availability: zookeeper
high-availability.zookeeper.quorum: zetagmaster:2181,zetagworker1:2181,zetagworker2:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /default_ns
high-availability.storageDir: hdfs://mycluster/flink/recovery
After the changes, sync them to all machines in the cluster via scp.
Note: HADOOP_HOME must be configured correctly here, otherwise startup will fail, since Flink needs it to locate both the Hadoop classes and the configuration of the mycluster nameservice used in high-availability.storageDir!
bin/stop-cluster.sh
bin/start-cluster.sh
Restart the cluster for the changes to take effect.
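To double-check that the JobManagers have registered with ZooKeeper, the znodes can be inspected with the ZooKeeper CLI; a sketch, assuming zkCli.sh from the ZooKeeper installation is on the PATH:
zkCli.sh -server zetagmaster:2181
ls /flink/default_ns
# child znodes such as the leader latch should appear once the JobManagers are up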