Introduction: Flink is currently the hottest stream-processing platform. It has grown rapidly in recent years and won over plenty of developers, so let's explore what makes it tick. As the saying goes, the first step is always the hardest, and here that step is setting up the environment!
- Environment setup. Flink is developed in Java, so you need the usual Java tooling: a JDK, Maven, an IDE such as IntelliJ IDEA, and so on. Installing these is routine for any Java engineer, so I won't go over it again here.
- Compiling the source. The download page is https://flink.apache.org/downloads.html; I'm currently on version 1.13.1, but feel free to grab the latest from the official site. After downloading, import the project into IDEA and comment out the build plugins you don't need, such as maven-checkstyle-plugin. Then make sure your Maven settings are correct, otherwise you will see errors like `Could not transfer metadata xxxxx from/to xxxxx`, meaning some dependencies cannot be fetched; in that case, look each missing dependency up and confirm it exists in a repository you can actually reach. My first build failed exactly this way: my local Maven settings pointed at the company's private registry with `mirrorOf` set to `*`, which disabled the extra repositories that Flink's sub-modules declare for themselves, and a few special dependencies were not in the private registry, so they could not be resolved. Pointing the build at a different settings file fixed it, for example: `mvn clean install --settings xxx`. A full build takes about half an hour, so be patient; errors partway through are often just network hiccups and go away on retry.
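If you also build behind a corporate mirror, the same fix can be expressed in the settings file itself. A minimal sketch (the mirror id, URL, and excluded repository names below are placeholders, not values from my setup):

```xml
<!-- ~/.m2/settings.xml (sketch; id/url/excluded names are placeholders) -->
<mirrors>
  <mirror>
    <id>company-nexus</id>
    <!-- mirrorOf "*" hijacks every repository, including the extra ones that
         Flink sub-modules declare in their own pom files. Excluding a
         repository id with "!" lets Maven reach it directly again. -->
    <mirrorOf>*,!confluent,!mapr-releases</mirrorOf>
    <url>https://nexus.example.com/repository/maven-public/</url>
  </mirror>
</mirrors>
```

Alternatively, keep a clean settings file just for the Flink build and pass it with `--settings`, as described above.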
- Checking that the build works. The compiled distribution lands in flink-dist under target/flink-1.13.1-bin; you can verify it by starting the standalone cluster from that directory with `bin/start-cluster.sh` (the web UI should come up on http://localhost:8081), and stop it again with `bin/stop-cluster.sh`.
- Setting up a development project.
a. Building with the quickstart archetype

Command for a Java project:

```shell
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-java \
  -DarchetypeVersion=1.13.1
```

Command for a Scala project:

```shell
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-scala \
  -DarchetypeVersion=1.13.1
```
b. From Flink 1.10 onward you can also generate a project from the official walkthrough example.

Command for a Java project:

```shell
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-walkthrough-datastream-java \
  -DarchetypeVersion=1.13.0 \
  -DgroupId=frauddetection \
  -DartifactId=frauddetection \
  -Dversion=0.1 \
  -Dpackage=spendreport \
  -DinteractiveMode=false
```

Command for a Scala project:

```shell
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-walkthrough-datastream-scala \
  -DarchetypeVersion=1.13.0 \
  -DgroupId=frauddetection \
  -DartifactId=frauddetection \
  -Dversion=0.1 \
  -Dpackage=spendreport \
  -DinteractiveMode=false
```
Note: the walkthrough module provides a ready-made Flink Maven Archetype that quickly creates a program skeleton with all the necessary dependencies already wired in, so you can concentrate on writing the business logic. The included dependencies are flink-streaming-java, the core dependency of every Flink application, and flink-walkthrough-common, which contains the data generator and the other classes this exercise relies on.
c. An error you may hit when importing the project into IDEA:

```
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Predef$
	at com.sht.flink.AccessLog$.main(AccessLog.scala:5)
	at com.sht.flink.AccessLog.main(AccessLog.scala)
Caused by: java.lang.ClassNotFoundException: scala.Predef$
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 2 more
```
The fix is simply to comment out the `provided` scope on the relevant dependencies, so that scala-library and the Flink jars end up on the runtime classpath when you run the job from inside the IDE.
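If you would rather not edit the scopes by hand, two alternatives exist: tick "Include dependencies with 'Provided' scope" in the IDEA run configuration, or add a profile along these lines to the pom. This is a sketch; the profile id and activation property follow a convention seen in Flink quickstart poms, so double-check against your generated project:

```xml
<!-- Sketch: put "provided" dependencies back on the classpath inside the IDE -->
<profile>
    <id>add-dependencies-for-IDEA</id>
    <activation>
        <property>
            <name>idea.version</name>
        </property>
    </activation>
    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
</profile>
```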
- maven-shade-plugin packaging issue. Flink uses the Java Service Provider Interface (SPI) to discover Source/Sink connectors. Without the `ServicesResourceTransformer` (`org.apache.maven.plugins.shade.resource.ServicesResourceTransformer`) in the shade plugin's `<transformers>` section, you are almost guaranteed to encounter `org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory`, which is exactly what happened to me. The transformer merges the `META-INF/services` files from all connector jars instead of letting one jar's file overwrite the others.
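To make that warning concrete, here is a tiny pure-JDK sketch (my own illustration, not Flink or shade-plugin code; the factory class names are made up) of what the `ServicesResourceTransformer` effectively does with the `META-INF/services` files that each connector jar ships:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: every connector jar ships a META-INF/services file
// listing its factory classes. Plain shading keeps only ONE of those files,
// so the other connectors become invisible to java.util.ServiceLoader.
public class ServiceFileMerge {

    // Hypothetical contents of the same service file from two connector jars.
    public static final String KAFKA_JAR = "org.example.kafka.KafkaTableFactory\n";
    public static final String JDBC_JAR  = "org.example.jdbc.JdbcTableFactory\n";

    // What the transformer effectively does: concatenate the files and drop
    // duplicate entries, so every factory stays discoverable in the shaded jar.
    public static List<String> merge(String... serviceFiles) {
        return Arrays.stream(serviceFiles)
                .flatMap(String::lines)
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // With the transformer, both factory entries survive the shade step.
        merge(KAFKA_JAR, JDBC_JAR).forEach(System.out::println);
        // Without it, one service file wins and Flink throws
        // NoMatchingTableFactoryException for the connector whose entry was lost.
    }
}
```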
- Java project pom file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.dpf.flink</groupId>
    <artifactId>flink-cdc</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>Flink Quickstart Job</name>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <flink.version>1.13.1</flink.version>
        <cdc.version>1.4.0</cdc.version>
        <fastjson.version>1.2.62</fastjson.version>
        <target.java.version>1.8</target.java.version>
        <scala.binary.version>2.11</scala.binary.version>
        <maven.compiler.source>${target.java.version}</maven.compiler.source>
        <maven.compiler.target>${target.java.version}</maven.compiler.target>
        <log4j.version>2.12.1</log4j.version>
    </properties>

    <repositories>
        <repository>
            <id>apache.snapshots</id>
            <name>Apache Development Snapshot Repository</name>
            <url>https://repository.apache.org/content/repositories/snapshots/</url>
            <releases>
                <enabled>false</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java-bridge_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-scala-bridge_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-common</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba.ververica</groupId>
            <artifactId>flink-connector-mysql-cdc</artifactId>
            <version>${cdc.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-jdbc_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-walkthrough-common_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-json</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>${fastjson.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>${target.java.version}</source>
                    <target>${target.java.version}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.1.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <excludes>
                                    <exclude>org.apache.flink:force-shading</exclude>
                                    <exclude>com.google.code.findbugs:jsr305</exclude>
                                    <exclude>org.slf4j:*</exclude>
                                    <exclude>org.apache.logging.log4j:*</exclude>
                                </excludes>
                            </artifactSet>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>com.dpf.flink.StreamingJob</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>

        <pluginManagement>
            <plugins>
                <!-- Keeps Eclipse/m2e from complaining about the plugins above. -->
                <plugin>
                    <groupId>org.eclipse.m2e</groupId>
                    <artifactId>lifecycle-mapping</artifactId>
                    <version>1.0.0</version>
                    <configuration>
                        <lifecycleMappingMetadata>
                            <pluginExecutions>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-shade-plugin</artifactId>
                                        <versionRange>[3.1.1,)</versionRange>
                                        <goals>
                                            <goal>shade</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-compiler-plugin</artifactId>
                                        <versionRange>[3.1,)</versionRange>
                                        <goals>
                                            <goal>testCompile</goal>
                                            <goal>compile</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                            </pluginExecutions>
                        </lifecycleMappingMetadata>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>
```
- Scala project pom file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.sht.flink</groupId>
    <artifactId>flink-sql</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>Flink Quickstart Job</name>

    <repositories>
        <repository>
            <id>apache.snapshots</id>
            <name>Apache Development Snapshot Repository</name>
            <url>https://repository.apache.org/content/repositories/snapshots/</url>
            <releases>
                <enabled>false</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <flink.version>1.13.1</flink.version>
        <target.java.version>1.8</target.java.version>
        <scala.binary.version>2.11</scala.binary.version>
        <scala.version>2.11.12</scala.version>
        <log4j.version>2.12.1</log4j.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>${log4j.version}</version>
            <scope>runtime</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.1.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <excludes>
                                    <exclude>org.apache.flink:force-shading</exclude>
                                    <exclude>com.google.code.findbugs:jsr305</exclude>
                                    <exclude>org.slf4j:*</exclude>
                                    <exclude>org.apache.logging.log4j:*</exclude>
                                </excludes>
                            </artifactSet>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>com.sht.flink.StreamingJob</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>${target.java.version}</source>
                    <target>${target.java.version}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <args>
                        <arg>-nobootcp</arg>
                        <arg>-target:jvm-${target.java.version}</arg>
                    </args>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-eclipse-plugin</artifactId>
                <version>2.8</version>
                <configuration>
                    <downloadSources>true</downloadSources>
                    <projectnatures>
                        <projectnature>org.scala-ide.sdt.core.scalanature</projectnature>
                        <projectnature>org.eclipse.jdt.core.javanature</projectnature>
                    </projectnatures>
                    <buildcommands>
                        <buildcommand>org.scala-ide.sdt.core.scalabuilder</buildcommand>
                    </buildcommands>
                    <classpathContainers>
                        <classpathContainer>org.scala-ide.sdt.launching.SCALA_CONTAINER</classpathContainer>
                        <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
                    </classpathContainers>
                    <excludes>
                        <exclude>org.scala-lang:scala-library</exclude>
                        <exclude>org.scala-lang:scala-compiler</exclude>
                    </excludes>
                    <sourceIncludes>
                        <sourceInclude>**/*.scala</sourceInclude>
                        <sourceInclude>**/*.java</sourceInclude>
                    </sourceIncludes>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>1.7</version>
                <executions>
                    <execution>
                        <id>add-source</id>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>add-source</goal>
                        </goals>
                        <configuration>
                            <sources>
                                <source>src/main/scala</source>
                            </sources>
                        </configuration>
                    </execution>
                    <execution>
                        <id>add-test-source</id>
                        <phase>generate-test-sources</phase>
                        <goals>
                            <goal>add-test-source</goal>
                        </goals>
                        <configuration>
                            <sources>
                                <source>src/test/scala</source>
                            </sources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
```