Setting up a Flink on YARN test environment
Software versions
hadoop 2.7.3
flink 0.8.0
jdk-8u171-linux-x64
maven 3.6.3
Install the JDK, Hadoop, and Maven under /opt
tar -zxvf jdk-8u171-linux-x64.tar.gz
mv jdk1.8.0_171/ /opt/jdk
tar -zxvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3/ /opt/hadoop
tar -zxvf apache-maven-3.6.3-bin.tar.gz
mv apache-maven-3.6.3/ /opt/maven
vi /opt/maven/conf/settings.xml
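The notes do not show what was changed in settings.xml; a common reason to edit it is to point Maven at a faster mirror. A sketch assuming the Aliyun public mirror is used (skip this if the default central repository is fast enough):
<!-- inside the <mirrors> element of /opt/maven/conf/settings.xml -->
<mirror>
  <id>aliyun-public</id>
  <mirrorOf>central</mirrorOf>
  <name>Aliyun public repository mirror</name>
  <url>https://maven.aliyun.com/repository/public</url>
</mirror>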
vi ~/.bashrc
export JAVA_HOME=/opt/jdk
export M2_HOME=/opt/maven
export HADOOP_HOME=/opt/hadoop
export FLINK_HOME=/opt/flink
export PATH=$PATH:$JAVA_HOME/bin:$M2_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$FLINK_HOME/bin
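Reload the shell configuration so the new variables take effect, and optionally verify the toolchain:
source ~/.bashrc
java -version
mvn -v
hadoop version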
Build Flink
tar -zxvf flink-0.8.0-src.tgz
cd flink-0.8.0
mvn clean install -DskipTests -Dhadoop.version=2.7.3
Copy the Flink on YARN build to /opt
cp -r flink-dist/target/flink-0.8.0-bin/flink-yarn-0.8.0/ /opt/flink
Configure HDFS
mkdir /opt/namenode
mkdir /opt/datanode
vi /opt/hadoop/etc/hadoop/core-site.xml
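The notes leave the file contents out. A minimal core-site.xml sketch consistent with the hdfs://localhost:9000 URIs that appear in the yarn-session log further down:
<configuration>
  <!-- default filesystem; matches the hdfs://localhost:9000 paths seen in the logs -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>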
vi /opt/hadoop/etc/hadoop/hdfs-site.xml
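A matching hdfs-site.xml sketch for a single-node setup, using the /opt/namenode and /opt/datanode directories created above (replication factor 1 is an assumption):
<configuration>
  <!-- single-node test environment, so no replication -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/datanode</value>
  </property>
</configuration>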
Format HDFS
hdfs namenode -format
Start the NameNode, SecondaryNameNode, and DataNode
hadoop-daemon.sh start namenode
hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh start datanode
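A quick check that the three HDFS daemons came up:
jps
# should list NameNode, SecondaryNameNode and DataNode (plus Jps itself)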
Configure YARN
cp /opt/hadoop/etc/hadoop/mapred-site.xml.template /opt/hadoop/etc/hadoop/mapred-site.xml
vi /opt/hadoop/etc/hadoop/mapred-site.xml
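The template is typically copied just to declare YARN as the MapReduce framework; a minimal sketch:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>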
vi /opt/hadoop/etc/hadoop/yarn-site.xml
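A minimal yarn-site.xml sketch; the hostname centos1 is taken from the session log below, and the NodeManager memory figure is an assumption sized to hold the 2048 MB JobManager plus the 2048 MB TaskManager requested later:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>centos1</value>
  </property>
  <!-- assumed value: must cover the JobManager (2048 MB) + TaskManager (2048 MB) containers -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>6144</value>
  </property>
</configuration>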
Start a YARN session
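The original notes go straight to yarn-session.sh, but the ResourceManager and NodeManager must already be running for the session to connect (the log below shows the client reaching the ResourceManager). On Hadoop 2.7 they can be started with:
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager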
yarn-session.sh -jm 2048 -tm 2048 -n 1
15:00:08,071 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:00:08,214 INFO org.apache.flink.yarn.Utils - Could not find HADOOP_CONF_PATH, using HADOOP_HOME.
15:00:08,214 INFO org.apache.flink.yarn.Utils - Found configuration using hadoop home.
15:00:08,260 INFO org.apache.flink.yarn.Client - Copy App Master jar from local filesystem and add to local environment
15:00:08,852 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /192.168.14.3:8032
Using values:
Container Count = 1
Jar Path = /opt/flink/lib/flink-dist-0.8.0-yarn-uberjar.jar
Configuration file = /opt/flink/conf/flink-conf.yaml
JobManager memory = 2048
TaskManager memory = 2048
TaskManager cores = 1
amCommand=$JAVA_HOME/bin/java -Xmx1638M -Dlog.file="
15:00:09,107 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/lib/flink-dist-0.8.0-yarn-uberjar.jar to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/flink-dist-0.8.0-yarn-uberjar.jar
15:00:10,173 INFO org.apache.flink.yarn.Utils - Copying from /opt/flink/conf/flink-conf.yaml to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/flink-conf.yaml
15:00:10,193 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/conf/logback.xml to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/logback.xml
15:00:10,219 INFO org.apache.flink.yarn.Utils - Copying from file:/opt/flink/conf/log4j.properties to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/log4j.properties
15:00:10,249 INFO org.apache.flink.yarn.Client - Submitting application master application_1588744655039_0008
15:00:10,282 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1588744655039_0008
Flink JobManager is now running on centos1:6131
JobManager Web Interface: http://centos1:8088/proxy/application_1588744655039_0008/
Number of connected TaskManagers changed to 1. Slots available: 1
Submit a job
hadoop fs -mkdir /input
hadoop fs -mkdir /output
hadoop fs -put /opt/flink/NOTICE /input
flink run -c org.apache.flink.examples.java.wordcount.WordCount /opt/flink/examples/flink-java-examples-0.8.0-WordCount.jar hdfs:///input/NOTICE hdfs:///output/20200506e
Found a yarn properties file (.yarn-properties) file, using "centos1:6131" to connect to the JobManager
05/06/2020 15:37:29: Job execution switched to status RUNNING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to RUNNING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to RUNNING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to FINISHED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter: )) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter: )) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter: )) (1/1) switched to RUNNING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to FINISHED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter: )) (1/1) switched to FINISHED
05/06/2020 15:37:29: Job execution switched to status FINISHED
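Once the job reports FINISHED, the WordCount result can be read back from HDFS. With parallelism 1 the sink usually writes a single file at the given path; if it turns out to be a directory, cat the files inside it instead:
hadoop fs -ls /output
hadoop fs -cat /output/20200506e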