Creating Datasets in Spark

Three ways to create a Dataset:
  • Convert a DataFrame into a Dataset
  • Create one directly with SparkSession.createDataset()
  • Convert implicitly with the toDS method
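All three examples below share the same setup, which the snippets assume is already in place: the imports, an illustrative SparkConf (the app name and master shown here are placeholders), and the Person case class. The case class must be declared at the top level, outside any method, so the implicit encoders from spark.implicits._ can be derived.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Top-level case class; its fields become the Dataset's columns
case class Person(name: String, age: Int, job: String)

// Illustrative configuration -- adjust the app name and master for your environment
val conf = new SparkConf().setAppName("dataset-demo").setMaster("local[*]")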
Example 1: Convert a DataFrame into a Dataset
as[Person] maps the DataFrame's columns to the fields of Person by name, so the column names and types must line up with the case class.
val spark = SparkSession.builder().config(conf).getOrCreate()
import spark.implicits._

// Build an untyped DataFrame first, then convert it to a typed Dataset[Person]
val df = spark.createDataFrame(List(Person("Jason", 34, "DBA"), Person("Tom", 20, "Dev"))).toDF("name", "age", "job")
val ds = df.as[Person]
ds.show()
spark.close()
Output:
+-----+---+---+
| name|age|job|
+-----+---+---+
|Jason| 34|DBA|
|  Tom| 20|Dev|
+-----+---+---+
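Because ds is a Dataset[Person] rather than a plain DataFrame, it can be transformed with ordinary typed lambdas. A minimal sketch, continuing from the ds above (run it before spark.close()):

// Typed operations: the compiler checks the Person fields
ds.filter(_.age > 30).show()
ds.map(p => p.name.toUpperCase).show()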
Example 2: Create directly with SparkSession.createDataset()
val spark = SparkSession.builder().config(conf).getOrCreate()
import spark.implicits._

// createDataset accepts a local Seq; the implicit Encoder[Person] comes from spark.implicits._
val ds = spark.createDataset(List(Person("Jason", 34, "DBA"), Person("Tom", 20, "Dev")))
ds.show()
Output:
+-----+---+---+
| name|age|job|
+-----+---+---+
|Jason| 34|DBA|
|  Tom| 20|Dev|
+-----+---+---+
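createDataset also has an overload that takes an RDD, which is handy when the data already lives in one. A minimal sketch, assuming the same Person case class and spark session as above:

// Parallelize a local collection into an RDD of Person, then build a typed Dataset from it
val rdd = spark.sparkContext.parallelize(Seq(Person("Jason", 34, "DBA"), Person("Tom", 20, "Dev")))
val rddDs = spark.createDataset(rdd)
rddDs.show()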
Example 3: Convert implicitly with the toDS method
val spark = SparkSession.builder().config(conf).getOrCreate()
import spark.implicits._

// spark.implicits._ adds toDS to local collections
val ds = List(Person("Jason", 34, "DBA"), Person("Tom", 20, "Dev")).toDS()
ds.show()
Output:
+-----+---+---+
| name|age|job|
+-----+---+---+
|Jason| 34|DBA|
|  Tom| 20|Dev|
+-----+---+---+
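Once spark.implicits._ is in scope, toDS is also available on RDDs. A minimal sketch under the same assumptions as above:

// The implicits also add toDS to RDDs
val peopleRdd = spark.sparkContext.parallelize(Seq(Person("Jason", 34, "DBA"), Person("Tom", 20, "Dev")))
val peopleDs = peopleRdd.toDS()
peopleDs.show()
spark.close()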
