5. Spark Streaming consuming MySQL real-time change data from Kafka

Previously we enabled binlog on MySQL and set up Canal to listen to it in real time and forward the changes to the Kafka topic example. Now we use Spark Streaming to consume those messages in real time and print them out.
Maven pom dependencies:


        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
            <version>2.3.0</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba.otter</groupId>
            <artifactId>canal.client</artifactId>
            <version>1.1.3</version>
        </dependency>

The Spark Streaming code is as follows:

import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaTest {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("Spark-Kafkatest1")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Kafka consumer settings; auto-commit is off so offsets can be managed by hand
    val kafkaParam = Map(
      "bootstrap.servers" -> "192.168.240.131:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "con-consumer-group",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // Subscribe to the "example" topic that Canal writes binlog events to
    val stream: InputDStream[ConsumerRecord[String, String]] =
      KafkaUtils.createDirectStream[String, String](
        ssc,
        LocationStrategies.PreferConsistent,
        ConsumerStrategies.Subscribe[String, String](Array("example"), kafkaParam))

    // Print each record's key and value
    stream.map(s => ("id:" + s.key(), "value:" + s.value())).foreachRDD(
      rdd => rdd.foreachPartition(
        partition => partition.foreach { message => println(message._1, message._2) }
      )
    )

    ssc.start()
    ssc.awaitTermination()
  }
}

Now monitor the table change events in real time. Run a few operations in MySQL:
[Screenshots: an INSERT followed by a DELETE on table user1 in database fth]
We can then see messages like the following in the console in real time:

(id:null,value:{"data":[{"id":"6","name":"liutao","adress":"hangzhou"}],"database":"fth","es":1556004862000,"id":23,"isDdl":false,
"mysqlType":{"id":"int","name":"varchar(100)","adress":"varchar(100)"},"old":null,"pkNames":null,"sql":"",
"sqlType":{"id":4,"name":12,"adress":12},"table":"user1","ts":1556004862943,"type":"INSERT"})
(id:null,value:{"data":[{"id":"6","name":"liutao","adress":"hangzhou"},{"id":"6","name":"liutao","adress":"hangzhou"}],
"database":"fth","es":1556005117000,"id":25,"isDdl":false,"mysqlType":{"id":"int","name":"varchar(100)","adress":"varchar(100)"},
"old":null,"pkNames":null,"sql":"","sqlType":{"id":4,"name":12,"adress":12},"table":"user1","ts":1556005117173,"type":"DELETE"})

In a real project these parameters would all live in a configuration file; this is just a test. In Spark Streaming you also have to maintain the Kafka offsets yourself (note that enable.auto.commit is false above). At my previous company we kept the offsets in MySQL, along the lines of the sketch below.
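
A minimal sketch of that pattern, extending the foreachRDD above: after each batch is processed, grab the batch's offset ranges via HasOffsetRanges and upsert them into MySQL. The kafka_offsets table, JDBC URL, and credentials here are hypothetical; on restart you would read the saved offsets back and pass them to ConsumerStrategies.Subscribe as its offsets argument.

import java.sql.DriverManager
import org.apache.spark.streaming.kafka010.{HasOffsetRanges, OffsetRange}

stream.foreachRDD { rdd =>
  // The offset ranges consumed by this batch
  val offsetRanges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // ... process the batch first, then persist the offsets ...

  val conn = DriverManager.getConnection(
    "jdbc:mysql://192.168.240.131:3306/fth", "user", "password") // hypothetical credentials
  try {
    val stmt = conn.prepareStatement(
      "REPLACE INTO kafka_offsets(topic, part, groupid, offset) VALUES (?, ?, ?, ?)") // hypothetical table
    for (o <- offsetRanges) {
      stmt.setString(1, o.topic)
      stmt.setInt(2, o.partition)
      stmt.setString(3, "con-consumer-group")
      stmt.setLong(4, o.untilOffset)
      stmt.executeUpdate()
    }
    stmt.close()
  } finally {
    conn.close()
  }
}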
