Spark SQL: Reading Local Files and HDFS Files


Reading a local file:

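    import org.apache.spark.sql.SparkSession

    // Local session with 4 worker threads
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON")
        .master("local[4]")
        .getOrCreate()

    import spark.implicits._

    // easy enough to query flat JSON (one JSON object per line)
    val people = spark.read.json("./data/people.json")
    people.printSchema()

    // Register the DataFrame as a temporary view so it can be queried with SQL
    people.createOrReplaceTempView("people")
    val young = spark.sql("SELECT * FROM people")
    // Printed on the console in local mode; on a cluster this output goes to the executor logs
    young.foreach(r => println(r))

    people.select("name").show()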
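
The query above selects every row, even though its result is named young. A minimal sketch of an actual filter follows, assuming the records in people.json carry name and age fields (the query fails if there is no age column). The object name YoungPeopleSketch, the helper youngNames and the age limit 21 are illustrative and not part of the original post.

```scala
import org.apache.spark.sql.SparkSession

object YoungPeopleSketch {

  // Hypothetical helper: names of people younger than maxAge, sorted by name.
  // Assumes the JSON records carry "name" and "age" fields.
  def youngNames(spark: SparkSession, path: String, maxAge: Int): Array[String] = {
    val people = spark.read.json(path)
    people.createOrReplaceTempView("people")
    spark
      .sql(s"SELECT name FROM people WHERE age < $maxAge ORDER BY name")
      .collect()              // bring the (small) result back to the driver
      .map(_.getString(0))
  }

  def main(args: Array[String]): Unit = {
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON")
        .master("local[4]")
        .getOrCreate()

    youngNames(spark, "./data/people.json", 21).foreach(println)
    spark.stop()
  }
}
```

Because the result is collected first, the names are printed by the driver regardless of where the job runs. Records without an age value are dropped by the comparison.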

Reading a file on HDFS:


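The two files shown in the original screenshot were copied from the HDFS configuration files and placed in the project. They are presumably core-site.xml and hdfs-site.xml; with them on the classpath, a path without a scheme, such as /usr/data/people.json below, is resolved against HDFS instead of the local file system.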

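import org.apache.spark.sql.SparkSession

object ReadJson {

  def main(args: Array[String]): Unit = {
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON")
        .master("local[4]")
        .getOrCreate()

    import spark.implicits._

    // easy enough to query flat JSON
    // The path has no scheme, so it is resolved against the default file system
    // taken from the Hadoop configuration files on the classpath
    val people = spark.read.json("/usr/data/people.json")
    people.printSchema()

    // Register the DataFrame as a temporary view so it can be queried with SQL
    people.createOrReplaceTempView("people")
    val young = spark.sql("SELECT * FROM people")

    young.foreach(r => println(r))

    people.select("name").show()

    spark.stop()
  }
}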
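
As an alternative to relying on configuration files on the classpath, the file system can be named in the path itself. The sketch below is an illustration, not the original author's setup: the NameNode address namenode:8020, the local path /tmp/data/people.json, the object name ReadJsonByUri and the helper readPeople are all hypothetical placeholders.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadJsonByUri {

  // Hypothetical helper: reads JSON from a fully qualified URI.
  // The URI scheme (hdfs://, file://) decides which file system is used.
  def readPeople(spark: SparkSession, uri: String): DataFrame =
    spark.read.json(uri)

  def main(args: Array[String]): Unit = {
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON-URI")
        .master("local[4]")
        .getOrCreate()

    // Placeholder host and port: use the fs.defaultFS value of your own cluster.
    val onHdfs = readPeople(spark, "hdfs://namenode:8020/usr/data/people.json")
    onHdfs.show()

    // Placeholder local path: file:// forces the local file system
    // even when HDFS is configured as the default.
    val onLocal = readPeople(spark, "file:///tmp/data/people.json")
    onLocal.show()

    spark.stop()
  }
}
```

Spelling out the scheme makes it obvious which file system a path refers to, which helps when one job reads from both local disk and HDFS.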
