1. Utility class for sending data to the target Kafka topic
import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

import scala.io.Source

/**
 * @Author: ch
 * @Date: 07/05/2020 12:20 AM
 * @Version 1.0
 * @Describe: Send data to the corresponding Kafka topic
 */
object KafkaUtils {
  val broker_list = "localhost:9092"
  val topic = "test" // Kafka topic; the Flink program must use the same one

  def writeToKafka(point: String, producer: KafkaProducer[String, String]): Unit = {
    val record = new ProducerRecord[String, String](topic, null, null, point)
    producer.send(record)
    System.out.println("Sending data: " + point)
    producer.flush()
  }

  def main(args: Array[String]): Unit = {
    val props = new Properties
    props.put("bootstrap.servers", broker_list)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")   // key serializer
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer") // value serializer
    val producer: KafkaProducer[String, String] = new KafkaProducer[String, String](props)
    Thread.sleep(300)
    val source = Source.fromFile("/Users/zytshijack/Documents/github/git/myrepositories/flink110test/src/main/resources/file/randompoint10MB.txt")
    val lineIterator = source.getLines
    for (line <- lineIterator) {
      writeToKafka(line, producer) // send every line of the file to Kafka
    }
    source.close()   // release the file handle
    producer.close() // release the producer's network resources
  }
}
Note that the producer is passed into writeToKafka as a parameter. If it is not, i.e. a new KafkaProducer is created on every call, the program eventually fails with:
Caused by: org.apache.kafka.common.KafkaException: java.io.IOException: Too many open files
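For comparison, this is roughly what the problematic variant looks like (a sketch, not code from the original post; it reuses the broker_list and topic fields and the imports of KafkaUtils above). Every call opens a new producer with its own sockets and file descriptors, none of which is ever closed, so the OS limit is reached quickly on a large input file.

  // Anti-pattern sketch: builds a new KafkaProducer per record.
  def writeToKafkaBroken(point: String): Unit = {
    val props = new Properties
    props.put("bootstrap.servers", broker_list)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props) // opened on every call, never closed
    producer.send(new ProducerRecord[String, String](topic, point))
    producer.flush()
  }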
2. Consuming messages from the Kafka topic
2.1 pom.xml dependency
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
    <version>1.10.0</version>
</dependency>
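If the project is built with sbt rather than Maven, the equivalent dependency line would be roughly the following (a sketch, reusing the same coordinates as above):

// build.sbt
libraryDependencies += "org.apache.flink" % "flink-connector-kafka-0.11_2.11" % "1.10.0"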
2.2 Sample code
import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.datastream.DataStreamSource
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011

/**
 * @Author: ch
 * @Date: 07/05/2020 12:44 AM
 * @Version 1.0
 * @Describe: Test reading data from the corresponding Kafka topic
 */
object KafkaReadTest {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    val props = new Properties
    props.put("bootstrap.servers", "localhost:9092")
    props.put("zookeeper.connect", "localhost:2181") // only used by the legacy 0.8 consumer; harmless here
    props.put("group.id", "metric-group")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")   // key deserializer
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer") // value deserializer
    props.put("auto.offset.reset", "latest") // start from the latest offsets when no committed offset exists
    val dataStreamSource: DataStreamSource[String] = env
      .addSource(new FlinkKafkaConsumer011[String](
        "test",                 // Kafka topic
        new SimpleStringSchema, // deserialize each record as a plain string
        props))
      .setParallelism(1)
    dataStreamSource.print // print the data sent by KafkaUtils to the console
    env.execute("Flink add data source")
  }
}
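The Kafka source behaves like any other DataStream source, so transformations can be chained in before printing. A minimal sketch (the map step is purely illustrative and not part of the original example) that would go inside main before env.execute:

    // Illustrative only: replace each consumed line with a short summary string.
    // Requires: import org.apache.flink.api.common.functions.MapFunction
    dataStreamSource
      .map(new MapFunction[String, String] {
        override def map(line: String): String = "received " + line.length + " chars"
      })
      .print()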
3. Running the test
Kafka is installed on a Mac; see this post for the relevant commands: https://segmentfault.com/a/11...
3.1 Enter the kafka/bin directory
(omitted)
3.2 Start ZooKeeper
sh zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties &
3.3 Start Kafka
sh kafka-server-start /usr/local/etc/kafka/server.properties &
3.4 Create a topic named test
sh kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
3.5 Check that the topic was created
sh kafka-topics --list --zookeeper localhost:2181
3.6 Run the KafkaUtils class to send the file's data to the test topic
The console continuously prints the data being sent.
3.7 Run the KafkaReadTest class to consume the data
The console continuously prints the consumed data.
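As a sanity check that is independent of the Flink job, the topic can also be tailed with Kafka's console consumer (assuming the same installation layout as the commands above):
sh kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning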
3.8 Stop Kafka
sh kafka-server-stop
3.9 Stop ZooKeeper
sh zookeeper-server-stop