Spark SQL Exercises

Problems:

 

-------Student table

 //student number//student name//student gender//student date of birth//student class

 

--------Course table

 

//course number//course name//teacher ID

   

------Score table

 

//student number (foreign key)//course number (foreign key)//score

 

----Teacher table

 

 //teacher ID (primary key)//teacher name//teacher gender//teacher date of birth//title//department

 

 

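1. Query the Sname, Ssex, and Class columns of all records in the Student table.

2. Query all the departments of the teachers, i.e. the distinct values of the Depart column.

3. Query all records in the Student table.

4. Query all records in the Score table with a score between 60 and 80.

5. Query the records in the Score table with a score of 85, 86, or 88.

6. Query the records of students in the Student table who are in class "95031" or whose gender is "female".

7. Query all records in the Student table ordered by Class, descending and ascending.

8. Query all records in the Score table ordered by Cno ascending and Degree descending.

9. Query the students in class "95031".

10. Query the student number and course number of the highest score in the Score table. (subquery or sorting)

11. Query the average score of each course.

12. Query the average score of the courses in the Score table that are taken by at least 5 students and whose course number starts with 3.

13. Query the Sno column for scores greater than 70 and less than 90.

14. Query the Sname, Cno, and Degree columns for all students.

15. Query the Sno, Cname, and Degree columns for all students.

16. Query the Sname, Cname, and Degree columns for all students.

17. Query the average score of the students in class "95033".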

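19. Query the records of all students who take course "3-105" and score higher than student "109".

20. Query the records in score of students who take more than one course where the score is not the highest score.

21. Query all records with a score higher than that of student "109" in course "3-105".

22. Query the Sno, Sname, and Sbirthday columns of all students born in the same year as student 105.

23. Query the scores of the students taught by teacher "Zhang Xu".

24. Query the names of teachers who teach a course taken by more than 4 students.

25. Query the records of all students in classes 95033 and 95031.

26. Query the Cno of courses that have a score above 85.

27. Query the score table of the courses taught by teachers in the "Computer Science Department".

28. Query the Tname and Prof of teachers in the "Computer Science Department" and the "Department of Electronic Engineering" whose titles differ between the two departments.

29. Query the Cno, Sno, and Degree of students who take course "3-105" and whose score is higher than at least one score of the students taking course "3-245", sorted by Degree from high to low.

30. Query the Cno, Sno, and Degree of students who take course "3-105" and whose score is higher than the scores of the students taking course "3-245".

31. Query the name, sex, and birthday of all teachers and students.

32. Query the name, sex, and birthday of all "female" teachers and "female" students. union

33. Query the score records of students whose score is lower than the average score of that course.

34. Query the Tname and Depart of all teachers who teach a course. in

35. Query the Tname and Depart of all teachers who do not teach any course. not in

36. Query the class numbers that have at least 2 male students. group by, having count

37. Query the records of students in the Student table whose surname is not "Wang". not like

38. Query the name and age of each student in the Student table.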

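When applying functions in Spark SQL calculations, a String column can be used directly in arithmetic without first converting it to a numeric type; by default it is cast to Double for the calculation.

Convert the floating-point result to an integer.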

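39. Query the maximum and minimum Sbirthday date values in the Student table. maximum and minimum of a date-formatted value

40. Query all records in the Student table ordered by class number and age, from largest to smallest. sorting query results

41. Query the "male" teachers and the courses they teach. select join

42. Query the Sno, Cno, and Degree columns of the student with the highest score. subquery

43. Query the Sname of all students of the same gender as "Li Jun".

44. Query the Sname of students of the same gender and in the same class as "Li Jun".

45. Query the score table of all "male" students who take the course "Introduction to Computer".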
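The note under problem 38 (a String column used directly in arithmetic, then converted to an integer) can be sketched as follows. This is only an illustration under assumptions: the object name AgeCastSketch, the view name student_sample, and the two sample rows are hypothetical and are not part of the exercise data set; the reference year 2018 follows the solution query for problem 38; and left() is assumed to be available (Spark 2.3 or later).

```scala
import org.apache.spark.sql.SparkSession

object AgeCastSketch {
  // Returns (name, age) pairs, ordered by student number.
  def ages(spark: SparkSession): Array[(String, Int)] = {
    import spark.implicits._
    // Hypothetical sample rows; sbirthday is deliberately kept as a String.
    Seq(("108", "Zeng Hua", "1977-09-01"), ("105", "Kuang Ming", "1975-10-02"))
      .toDF("sid", "sname", "sbirthday")
      .createOrReplaceTempView("student_sample")
    // left(sbirthday, 4) is a String; Spark casts it implicitly for the subtraction.
    // The outer cast turns the numeric result into an integer age.
    spark.sql(
      "select sname, cast(2018 - left(sbirthday, 4) as int) as age " +
        "from student_sample order by sid"
    ).collect().map(row => (row.getString(0), row.getInt(1)))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("age-cast-sketch").getOrCreate()
    ages(spark).foreach { case (name, age) => println(s"$name: $age") }
    spark.stop()
  }
}
```

Without the outer cast, the age column keeps whatever numeric type the implicit conversion produced, which is the Double the note refers to.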

Code:

Score and Course use the case class approach (my personal preference, since the code is more concise)

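import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

case class Score(sid: String, scid: String, sscore: Double)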

case class Course(cid: String, cname: String, tid: String)

object SparkSqlTest {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder().master("local").getOrCreate()
    val sparkContext = sparkSession.sparkContext
    val studentRDD = sparkContext.textFile("hdfs://master:9000/input/Student").map {
      lines =>
        val line = lines.split(",")
        Row(line(0), line(1), line(2), line(3), line(4))
    }
    val studentSchema: StructType = StructType(Seq(
      StructField("sid", StringType, nullable = false),
      StructField("sname", StringType, nullable = false),
      StructField("ssex", StringType, nullable = false),
      StructField("sbirthday", StringType, nullable = false),
      StructField("sclass", StringType, nullable = false)
    ))
    val studentDF = sparkSession.createDataFrame(studentRDD, studentSchema)
    studentDF.createTempView("student")

    val teacherRDD = sparkContext.textFile("hdfs://master:9000/input/Teacher").map {
      lines =>
        val line = lines.split(",")
        Row(line(0), line(1), line(2), line(3), line(4), line(5))
    }
    val teacherSchema: StructType = StructType(Seq(
      StructField("tid", StringType, nullable = false),
      StructField("tname", StringType, nullable = false),
      StructField("tsex", StringType, nullable = false),
      StructField("tbirthday", StringType, nullable = false),
      StructField("ttitle", StringType, nullable = false),
      StructField("tdepart", StringType, nullable = false)
    ))
    val teacherDF = sparkSession.createDataFrame(teacherRDD, teacherSchema)
    teacherDF.createTempView("teacher")

    val scoreRDD = sparkContext.textFile("hdfs://master:9000/input/Score").map(line => line.split(",")).map(x => Score(x(0), x(1), x(2).toDouble))
    import sparkSession.implicits._
    val scoreDF = scoreRDD.toDF()
    scoreDF.createTempView("score")

    val courseRDD = sparkContext.textFile("hdfs://master:9000/input/Course").map(line => line.split(",")).map(x => Course(x(0), x(1), x(2)))
    val courseDF = courseRDD.toDF()
    courseDF.createTempView("course")



    //    sparkSession.sql("select sname,ssex,sclass from student").show()
    //    sparkSession.sql("select tdepart from teacher group by tdepart").show()
    //    sparkSession.sql("select * from student").show()
    //    sparkSession.sql("select * from score where sscore between 60 and 80").show()
    //    sparkSession.sql("select * from score where sscore in (85,86,88)").show()
    //    sparkSession.sql("select * from student where ssex = 'female' or sclass = '95031'").show()
    //    sparkSession.sql("select * from student order by sclass desc").show()
    //    sparkSession.sql("select * from student where sclass = '95031'").show()
    //    sparkSession.sql("select * from score order by scid asc, sscore desc").show()
    //    sparkSession.sql("select s.sid,c.cid from student s,course c,score sc where s.sid = sc.sid and sc.scid = c.cid order by sc.sscore desc limit 1").show()
    //    sparkSession.sql("select scid,avg(sscore) from score group by scid").show()
    //    sparkSession.sql("select scid,count(score.scid) c,sum(sscore) s from score where scid like '3%' group by scid").show()
    //    sparkSession.sql("select a.scid scid,a.s/a.c average from (select scid scid,count(score.scid) c,sum(sscore) s from score where scid like '3%' group by scid) a where a.c > 4").show()
    //    sparkSession.sql("select sid from score where sscore > 70 and sscore < 90").show()
    //    sparkSession.sql("select student.sname,course.cid,score.sscore from student,score,course where student.sid = score.sid and score.scid = course.cid").show()
    //    sparkSession.sql("select student.sid,course.cname,score.sscore from student,score,course where student.sid = score.sid and score.scid = course.cid").show()
    //    sparkSession.sql("select student.sname,course.cname,score.sscore from student,score,course where student.sid = score.sid and score.scid = course.cid").show()
    //    sparkSession.sql("select student.sclass,avg(score.sscore) from student,score where student.sid = score.sid and student.sclass = '95033' group by student.sclass").show()
    //    sparkSession.sql("select student.sname,score.sscore from course,score,student,(select sscore s from score where sid = '109' and scid = '3-105') a where student.sid = score.sid and course.cid = score.scid and course.cid = '3-105' and score.sscore > a.s").show()
    //    sparkSession.sql("select * from score a where sid in (select sid from score group by sid having count(sid)>1) and a.sscore < (select max(sscore) from score b where a.scid = b.scid)").show()
    //    sparkSession.sql("select * from score where sscore > (select sscore from score where sid = '109' and scid = '3-105')").show()
    //    sparkSession.sql("select s.sid,s.sname,s.sbirthday from student s where left(s.sbirthday,4) = (select left(sbirthday,4) from student where sid = '105') ").show()
    //    sparkSession.sql("select s.sid,s.sscore,t.tname from score s,teacher t,course c where t.tname = 'Zhang xu' and t.tid = c.tid and c.cid = s.scid").show()
    //    sparkSession.sql("select t.tname from teacher t,course c where t.tid = c.tid and c.cid in (select scid from score group by scid having count(scid) > 4)").show()
    //    sparkSession.sql("select * from student where sclass in ('95033','95031')").show()
    //    sparkSession.sql("select c.cid from course c,score s where c.cid = s.scid and s.sscore > 85 group by c.cid").show()
    //    sparkSession.sql("select c.cid,c.cname,stu.sname,s.sscore from student stu,course c,score s,teacher t where s.sid = stu.sid and t.tid = c.tid and c.cid = s.scid and t.tdepart like '%computer%'").show()
    //    sparkSession.sql("select tname,ttitle from teacher where tdepart in ('computer science department','department of electronic engineering') and ttitle not in (select a.ttitle from teacher a,teacher b where a.ttitle = b.ttitle and a.tdepart = 'department of electronic engineering' and b.tdepart ='computer science department')").show()
    //    sparkSession.sql("select * from score where scid = '3-105' and sscore > (select min(sscore) from score where scid = '3-245') order by sscore desc").show()
    //    sparkSession.sql("select * from score where scid = '3-105' and sscore > (select max(sscore) from score where scid = '3-245')").show()
    //    sparkSession.sql("select sname,ssex,sbirthday from student union select tname,tsex,tbirthday from teacher").show()
    //    sparkSession.sql("select sname,ssex,sbirthday from student where ssex = 'female' union select tname,tsex,tbirthday from teacher where tsex = 'female'").show()
    //    sparkSession.sql("select * from score a where sscore < (select avg(sscore) from score b where a.scid = b.scid)").show()
    //    sparkSession.sql("select tname,tdepart from teacher where tid in (select tid from course)").show()
    //    sparkSession.sql("select tname,tdepart from teacher where tid not in (select tid from course)").show()
    //    sparkSession.sql("select sclass from student where ssex = 'male' group by sclass having count(*) >= 2").show()
    //    sparkSession.sql("select * from student where sname not like 'Wang%'").show()
    //    sparkSession.sql("select sname as name,(2018-left(sbirthday,4)) as age from student").show()
    //    sparkSession.sql("select sname,sbirthday from student where sbirthday = (select max(sbirthday) from student) union select sname,sbirthday from student where sbirthday = (select min(sbirthday) from student)").show()
    //    sparkSession.sql("select * from student order by sclass desc,sbirthday asc").show()
    //    sparkSession.sql("select tname,cname from teacher,course where tsex = 'male' and teacher.tid = course.tid").show()
    //    sparkSession.sql("select sid,scid,sscore from score where sscore = (select max(sscore) from score)").show()
    //    sparkSession.sql("select sname from student where ssex = (select ssex from student where sname = 'Li Jun')").show()
    //    sparkSession.sql("select sname from student where ssex = (select ssex from student where sname = 'Li Jun') and sclass = (select sclass from student where sname = 'Li Jun')").show()
    sparkSession.sql("select stu.sname,s.sscore from student stu,score s,course c where stu.sid = s.sid and s.scid = c.cid and stu.ssex = 'male' and c.cname = 'Introduction to computer'").show()
  }
}
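
The Score rows above are built with split(",") followed by toDouble, which throws on a malformed line. A defensive variant could look like the sketch below. It is not part of the original solution: ScoreParser and parse are hypothetical names, and the Score case class is redeclared only so that the sketch compiles on its own.

```scala
import scala.util.Try

// Same shape as the Score case class used in the solution.
case class Score(sid: String, scid: String, sscore: Double)

object ScoreParser {
  // Returns Some(Score) for a well-formed "sid,scid,score" line, None otherwise.
  def parse(line: String): Option[Score] =
    line.split(",").map(_.trim) match {
      case Array(sid, scid, raw) =>
        // Try catches the NumberFormatException raised by a non-numeric score.
        Try(raw.toDouble).toOption.map(value => Score(sid, scid, value))
      case _ => None // wrong number of fields
    }
}
```

With an RDD of lines, a parser like this would be used as `lines.flatMap(line => ScoreParser.parse(line))`, so that bad lines are skipped instead of failing the job.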
