DataFrame与DataSet的互操作

1. DataFrame转换为DataSet

1)创建一个DateFrame

scala> val df = spark.read.json("examples/src/main/resources/people.json")

df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]

2)创建一个样例类

scala> case class Person(name: String, age: Long)

defined class Person

3)将DateFrame转化为DataSet

scala> df.as[Person]

res14: org.apache.spark.sql.Dataset[Person] = [age: bigint, name: string]

2. DataSet转换为DataFrame

1)创建一个样例类

scala> case class Person(name: String, age: Long)

defined class Person

2)创建DataSet

scala> val ds = Seq(Person("Andy", 32)).toDS()

ds: org.apache.spark.sql.Dataset[Person] = [name: string, age: bigint]

3)将DataSet转化为DataFrame

scala> val df = ds.toDF

df: org.apache.spark.sql.DataFrame = [name: string, age: bigint]

4)展示

scala> df.show

+----+---+

|name|age|

+----+---+

|Andy| 32|

+----+---+

你可能感兴趣的:(Spark)