Spark SQL Join Operations

 

1. Data Preparation

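This article covers multi-table joins in Spark SQL, which require test data to be prepared in advance. Create one DataFrame for employees and one for departments, then register each as a temporary view, as in the following code: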

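import org.apache.spark.sql.SparkSession

// Local session running with two worker threads
val spark = SparkSession.builder().appName("aggregations").master("local[2]").getOrCreate()

// Load the employee records and expose them to SQL as the "emp" view
val empDF = spark.read.json("/data/file/json/emp.json")
empDF.createOrReplaceTempView("emp")

// Load the department records and expose them to SQL as the "dept" view
val deptDF = spark.read.json("/data/file/json/dept.json")
deptDF.createOrReplaceTempView("dept")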
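Once both views are registered, they can be referenced by name in SQL. The following is only a minimal sketch of that idea, written so that it runs without the JSON files: the in-memory rows, the `DNAME` column, the department names, and the employee `DEMO` are hypothetical stand-ins, and only `SMITH` with `DEPTNO` 20 comes from the sample data. It assumes the department data also carries a `DEPTNO` column to join on.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("join-preview").master("local[2]").getOrCreate()
import spark.implicits._

// Hypothetical in-memory stand-ins for emp.json and dept.json
val empDF = Seq((7369, "SMITH", 20), (9999, "DEMO", 30)).toDF("EMPNO", "ENAME", "DEPTNO")
val deptDF = Seq((20, "RESEARCH"), (30, "SALES")).toDF("DEPTNO", "DNAME")

// Register both DataFrames so SQL can refer to them by view name
empDF.createOrReplaceTempView("emp")
deptDF.createOrReplaceTempView("dept")

// Inner join: each employee paired with the department sharing its DEPTNO
val joined = spark.sql(
  "SELECT e.ENAME, d.DNAME FROM emp e JOIN dept d ON e.DEPTNO = d.DEPTNO")
joined.show()
```

The snippet uses top-level statements in the same style as the code above, so it can be pasted into `spark-shell` (where a `spark` session already exists and the builder call simply returns it).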

The contents of emp.json are as follows:

{"EMPNO": 7369,"ENAME": "SMITH","JOB": "CLERK","MGR": 7902,"HIREDATE": "1980-12-17 00:00:00","SAL": 800.00,"COMM": null,"DEPTNO": 20}
{"EMPNO": 7499,"ENAME": "ALLEN","JOB": "SALESMA
