Flink java wordcount

  • Preface
    • Project directory structure
    • pom.xml
    • WindowWordCount.java
    • helloword.txt
    • Run results
    • Notes

Preface

Hello everyone, and welcome to my blog. I will keep posting notes from my Flink learning journey here, and I hope more and more newcomers will join in.
This post walks through the classic Flink starter program: wordcount.

Project directory structure

This project is a simple Spring Boot project.
LiuUtilApplication: the Spring Boot application startup class.
WindowWordCount: the class that implements the word count with Flink's DataSet (batch) API.
helloword.txt: a text file containing words separated by spaces.
[Figure 1: project directory structure]
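
The post does not list LiuUtilApplication itself. Below is a minimal sketch of what such a startup class usually looks like; the package name and class body are assumptions based on a standard Spring Boot starter class, not taken from the project.

package com.liulangtao.flink;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Hypothetical sketch: the real class is not shown in this post.
@SpringBootApplication
public class LiuUtilApplication {

    public static void main(String[] args) {
        // Start the Spring context; the Flink job lives in WindowWordCount's own main method.
        SpringApplication.run(LiuUtilApplication.class, args);
    }
}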

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.0</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.liulangtao.flink</groupId>
    <artifactId>liu-util</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>liu-util</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>ch.qos.logback</groupId>
                    <artifactId>logback-classic</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-core</artifactId>
            <version>1.11.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>1.11.2</version>
        </dependency>


        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.11</artifactId>
            <version>1.11.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.11</artifactId>
            <version>1.11.2</version>
        </dependency>


        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>1.11.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-runtime-web_2.11</artifactId>
            <version>1.11.2</version>
        </dependency>


        <!--<dependency>-->
            <!--<groupId>org.apache.flink</groupId>-->
            <!--<artifactId>flink-connector-kafka_2.12</artifactId>-->
            <!--<version>1.11.2</version>-->
        <!--</dependency>-->


        <!--<dependency>-->
            <!--<groupId>org.apache.flink</groupId>-->
            <!--<artifactId>flink-table-common</artifactId>-->
            <!--<version>1.11.2</version>-->
        <!--</dependency>-->

        <!--#flink-streaming-java_2.11-->
        <!--<dependency>-->
            <!--  <groupId>org.apache.flink</groupId>-->
            <!--  <artifactId>flink-streaming-java_2.11</artifactId>-->
            <!--  <version>1.11.1</version>-->
            <!--  <scope>provided</scope>-->
        <!--</dependency>-->
        <!--#flink-scala_2.11-->
        <!--<dependency>-->
            <!--  <groupId>org.apache.flink</groupId>-->
            <!--  <artifactId>flink-scala_2.11</artifactId>-->
            <!--  <version>1.11.1</version>-->
            <!--  <scope>provided</scope>-->
        <!--</dependency>-->
        <!--#flink-streaming-scala_2.11-->
        <!--<dependency>-->
            <!--  <groupId>org.apache.flink</groupId>-->
            <!--  <artifactId>flink-streaming-scala_2.11</artifactId>-->
            <!--  <version>1.11.1</version>-->
            <!--  <scope>provided</scope>-->
        <!--</dependency>-->

        <!--#flink-connector-kafka-0.10_2.11-->
        <!--<dependency>-->
        <!--    <groupId>org.apache.flink</groupId>-->
        <!--    <artifactId>flink-connector-kafka-0.10_2.11</artifactId>-->
        <!--    <version>1.11.1</version>-->
        <!--</dependency>-->
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

WindowWordCount.java

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class WindowWordCount {

    public static void main(String[] args) throws Exception {
        // Batch (DataSet API) execution environment.
        ExecutionEnvironment executionEnvironment = ExecutionEnvironment.getExecutionEnvironment();

        // Read the sample file line by line.
        String path = "E:\\4-openProject\\20-flink\\src\\main\\resources\\helloword.txt";
        DataSet<String> stringDataSource = executionEnvironment.readTextFile(path);

        // Split every line into (word, 1) pairs, group by the word (field 0)
        // and sum the counts (field 1).
        DataSet<Tuple2<String, Integer>> set = stringDataSource
                .flatMap(new MyFlatMap())
                .groupBy(0)
                .sum(1);

        set.print();
    }

    public static class MyFlatMap implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
            // Split on a single space and emit each token with an initial count of 1.
            String[] words = s.split(" ");
            for (String word : words) {
                collector.collect(new Tuple2<>(word, 1));
            }
        }
    }
}

helloword.txt

hello word
hellp zhangsan
hello wangwu
hellp  lisi
hello 123

Run results

[Figure 2: console output of the word count]
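
The screenshot itself is not reproduced here, but the counts can be worked out from helloword.txt and the code above. A sketch of the expected output follows; DataSet#print gives no ordering guarantee, so the actual order may differ, and note that the double space in "hellp  lisi" yields one empty token because each line is split on a single space.

(hello,3)
(hellp,2)
(word,1)
(zhangsan,1)
(wangwu,1)
(,1)
(lisi,1)
(123,1)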

Notes

1. In the pom file, exclude the ch.qos.logback binding from spring-boot-starter, otherwise the program prints a huge amount of log output.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
    <exclusions>
        <exclusion>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
        </exclusion>
    </exclusions>
</dependency>
2. If the logging binding is not excluded, the following error appears. Since I am still new to Flink, I am not yet sure of the exact cause.

org.apache.flink.runtime.taskexecutor.slot.SlotNotFoundException: Could not find slot for fecc2c52cd4abfe0b0c7400fce272eee.

My preliminary guess is that Spring Boot has not bound the Flink data source, so when the Spring Boot application runs the Flink program it finds that the Flink node has already stopped.

[Figure 3: full error stack trace]
