原表记录:
zs,math,75
zs,chinese,81
lisi,chinese,76
lisi,math,89
wangwu,math,100
wangwu,chinese,81
wangwu,english,99
zhaoliu,math,16
zhaoliu,english,48
zhaoliu,chinese,59
需要对name进行分组,对成绩进行排名,所以考虑使用窗口函数,目前mysql 8.0支持窗口函数,该操作是在hive上执行的查询结果:
select * from (select *, rank() over(partition by name order by fenshu desc) as flag from studentss)t where t.flag<=2;
lisi math 89 1
lisi chinese 76 2
wangwu math 100 1
wangwu english 99 2
zhaoliu chinese 59 1
zhaoliu english 48 2
zs chinese 81 1
zs math 75 2
select * from (select *, rank() over(partition by kecheng order by fenshu desc) as flag from studentss)t where t.flag<=1;
wangwu chinese 81 1
zs chinese 81 1
wangwu english 99 1
wangwu math 100 1
select * from (select *, rank() over(partition by kecheng order by fenshu desc) as flag from studentss)t where t.flag<=2;
wangwu chinese 81 1
zs chinese 81 1
wangwu english 99 1
zhaoliu english 48 2
wangwu math 100 1
lisi math 89 2
sql中常规做法:
delete from studentss where bianhao not in (
select t.no from (
select min(bianhao) no from studentss group by xuehao,name,kechenghao,kecheng,fenshu
)t
)
通过窗口函数的做法:
select * from (select *, ROW_NUMBER() over(partition by name,kecheng,fenshu) as flag from studentss)t where flag>1;
这个题目源于,有一张team表,表种有a,b,c,d 四个队伍,求他们之间两两进行白赛的对阵信息。
原表:
a
b
c
d
select * from a,b where a.name>b.name;
b a
c a
d a
c b
d b
d c
我的做法应该是比较笨的做法,
select distinct name from students left join (
select kecheng,avg(fenshu) as avg_score from students group by kecheng
) tmp on students.kecheng=tmp.kecheng where students.fenshu>tmp.avg_score;
下面是使用窗口函数 解决上面的问题:
step1 : 通过窗口函数,对课程分组, 求课程的平均成绩:
select * from (select *, AVG(fenshu) over(partition by kecheng) as flag from studentss)t;
zhaoliu chinese 59 74.25
wangwu chinese 81 74.25
lisi chinese 76 74.25
zs chinese 81 74.25
zhaoliu english 48 73.5
wangwu english 99 73.5
zhaoliu math 16 70.0
wangwu math 100 70.0
lisi math 89 70.0
zs math 75 70.0
step2:过滤出 成绩大于平均成绩的学生
select * from (select *, AVG(fenshu) over(partition by kecheng) as flag from studentss)t where fenshu >t.flag;
wangwu chinese 81 74.25
lisi chinese 76 74.25
zs chinese 81 74.25
wangwu english 99 73.5
wangwu math 100 70.0
lisi math 89 70.0
zs math 75 70.0
select distinct name from students where name not in(
select distinct name from students left join (
select kecheng,avg(fenshu) as avg_score from students group by kecheng
) tmp on students.kecheng=tmp.kecheng where students.fenshu
窗口函数(因为hive是不支持子查询的,所以这里用left outer join 替代)
select distinct a.name from studentss a left outer join (select * from (select *, AVG(fenshu) over(partition by kecheng) as flag from studentss)t where t.flag<=fenshu) b on a.name = b.name and a.kecheng=b.kecheng and a.fenshu=b.fenshu and b.fenshu is not null;
未完待续
之后还会补充。。。。