【Spark】Converting a DStream to a DataFrame

  1. Split each record with split(","), which assumes the data is comma-separated;
  2. Splitting yields an Array; fetch each field by index, and be sure to convert it to the data type of the corresponding table column;
  3. Call toDF() with the table's column names;
  4. saveToPhoenix() writes the result to HBase through Phoenix.
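Step 2 silently throws if a record is malformed (wrong field count, non-numeric text), which would kill the whole batch. A minimal sketch of a defensive variant, using a hypothetical `parseRow` helper (the name and the `Option` return type are assumptions, not from the original):

```scala
import scala.util.Try

// Hypothetical helper: parse one comma-separated record into the typed tuple
// (Int, Int, Double, Double, Double, Long); returns None on malformed input
// instead of throwing, so bad records can be dropped with flatMap.
def parseRow(line: String): Option[(Int, Int, Double, Double, Double, Long)] = {
  val f = line.split(",")
  if (f.length < 6) None
  else Try((
    f(0).trim.toInt,
    f(1).trim.toInt,
    f(2).trim.toDouble,
    f(3).trim.toDouble,
    f(4).trim.toDouble,
    f(5).trim.toLong
  )).toOption
}
```

With this, `rdd.map(...)` in the snippet below could become `rdd.flatMap(r => parseRow(r.value()))`, skipping unparseable lines rather than failing the job.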
val stream = context("heatData")  // DStream whose records expose a comma-separated string via value()
val sqlContext = sparkSession.sqlContext
import org.apache.phoenix.spark._  // brings saveToPhoenix into scope
import sqlContext.implicits._      // brings toDF into scope
stream.foreachRDD(rdd => {
  rdd.map(_.value().split(","))
    .map(item => (
      item(0).toInt,     // ID
      item(1).toInt,     // CONSUMERID
      item(2).toDouble,  // TEMPERATURE
      item(3).toDouble,  // PRESSURE
      item(4).toDouble,  // FLOW
      item(5).toLong))   // CREATEDATE
    .toDF("ID", "CONSUMERID", "TEMPERATURE", "PRESSURE", "FLOW", "CREATEDATE")
    .saveToPhoenix(Map("table" -> "heatData", "zkUrl" -> "hbase:2181"))
})
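The snippet never shows where `context("heatData")` comes from, but the `.value()` call suggests a Kafka `ConsumerRecord`. A sketch of one common way to obtain such a DStream, assuming the Kafka 0.10 direct-stream integration; the broker address, group id, and topic name here are placeholders, not from the original:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

// Assumed wiring: subscribe to a "heatData" topic and get a
// DStream[ConsumerRecord[String, String]], whose value() is the CSV line.
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "kafka:9092",           // assumed broker address
  "key.deserializer"  -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"          -> "heatData-consumer",    // assumed consumer group
  "auto.offset.reset" -> "latest"
)

val stream = KafkaUtils.createDirectStream[String, String](
  streamingContext,  // an existing StreamingContext
  PreferConsistent,
  Subscribe[String, String](Seq("heatData"), kafkaParams)
)
```

This is infrastructure wiring rather than logic, so it only runs against a live Spark + Kafka setup.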
