Use Hive HQL to look up each user's department
dpt table (产品 = Product, 技术 = Technology)
dpt_id dpt_name
1 产品
2 技术
user_dpt table
user_id dpt_id
1 1
2 1
3 2
4 2
5 3
Expected result (其他部门 = Other department):
1 1 产品
2 1 产品
3 2 技术
4 2 技术
5 3 其他部门
SELECT t1.user_id
,t1.dpt_id
,nvl(t2.dpt_name, '其他部门') AS dpt_name -- fall back when the department is missing from dpt
FROM user_dpt t1
LEFT JOIN dpt t2
ON t1.dpt_id = t2.dpt_id
;
Find the highest-scoring record for each course in each term
course_score table (数学 = Math, 语文 = Chinese, 化学 = Chemistry):
id,name,course,score,term
1,zhangsan,数学,80,2015
2,lisi,语文,90,2016
3,lisi,数学,70,2016
4,wangwu,化学,80,2017
5,zhangsan,语文,85,2015
6,zhangsan,化学,80,2015
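Write SQL for the following queries, ideally as a single query each, although several queries are also acceptable:
1. Find the highest-scoring record for each course in each term (include all 5 columns)
2. Find the Math score records of students who scored 90 or above in Chinese within a single term (include all 5 columns)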
--(1)
SELECT t.*
FROM course_score t,
(
SELECT course
,term
,MAX(score) score
FROM course_score
GROUP BY course
,term
) tmp
WHERE tmp.course = t.course
AND tmp.term = t.term
AND tmp.score = t.score
;
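The same result can also be sketched with a window function, which reads course_score once instead of joining it to an aggregate of itself. This is an alternative, not the answer given above, and it assumes a Hive version with windowing support; the alias rk is illustrative. RANK() keeps ties, so if two students share the top score for a course in a term, both rows are returned, as with the join.

```sql
-- Rank records within each (term, course) group by score, highest first,
-- then keep only the top-ranked rows.
SELECT id, name, course, score, term
FROM (
    SELECT id, name, course, score, term
          ,RANK() OVER (PARTITION BY term, course ORDER BY score DESC) AS rk
    FROM course_score
) ranked
WHERE rk = 1
;
```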
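--(2)
SELECT t.*
FROM course_score t
JOIN (
-- students with a 语文 (Chinese) score of 90 or above, per term; 以上 is read as inclusive
SELECT DISTINCT name
,term
FROM course_score
WHERE course = '语文'
AND score >= 90
) tmp
ON t.name = tmp.name
AND t.term = tmp.term
WHERE t.course = '数学'
;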
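For requirement 2, a LEFT SEMI JOIN is one way to express "students who have a qualifying 语文 record" without duplicating rows when a student has several such records. This sketch assumes that a student is identified by name, that both records must fall in the same term, and that 以上 means "greater than or equal to".

```sql
-- Keep 数学 (Math) rows whose student has a 语文 (Chinese) score >= 90 in the same term.
-- In Hive, the right side of a LEFT SEMI JOIN may only be referenced in the ON clause.
SELECT t.*
FROM course_score t
LEFT SEMI JOIN (
    SELECT name, term
    FROM course_score
    WHERE course = '语文'
      AND score >= 90
) c
ON t.name = c.name AND t.term = c.term
WHERE t.course = '数学'
;
```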
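Design database tables to store basic student information, course information, and each student's courses and scores, then give an SQL statement that finds all students whose average score is greater than 85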
stu_1
id,name,age,addr
1,zs1,22,bj
2,zs2,22,bj
3,zs3,22,bj
4,zs4,22,bj
5,zs5,22,bj
course_1
cid,cname
1,语文
2,数学
3,政治
4,美术
5,历史
course_sc
id,cid,score
1,1,87
1,2,92
1,3,69
2,2,83
2,3,92
2,4,87
2,5,83
SELECT *
FROM stu_1,
(
SELECT id
,AVG(score) a
FROM course_sc
GROUP BY id
HAVING AVG(score) > 85
) tmp
WHERE stu_1.id = tmp.id
;
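The three tables above could be declared in Hive as follows. This is only a sketch: the column types, the comma delimiter, and the TEXTFILE storage format are assumptions, since the notes list only column names and sample rows.

```sql
-- Student master data (types are assumed)
CREATE TABLE IF NOT EXISTS stu_1 (
    id   INT    COMMENT 'student id',
    name STRING COMMENT 'student name',
    age  INT    COMMENT 'age',
    addr STRING COMMENT 'address'
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Course master data
CREATE TABLE IF NOT EXISTS course_1 (
    cid   INT    COMMENT 'course id',
    cname STRING COMMENT 'course name'
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- One row per student and course taken
CREATE TABLE IF NOT EXISTS course_sc (
    id    INT COMMENT 'student id, refers to stu_1.id',
    cid   INT COMMENT 'course id, refers to course_1.cid',
    score INT COMMENT 'score'
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```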
Number of ordering users and total order amount per channel
oid,uid,amount,channel,otime
1,100,19,a,2019-08-06 19:00:00
2,101,19,b,2019-08-06 19:00:01
3,100,19,a,2019-08-05 19:00:00
4,101,19,b,2019-08-05 19:00:01
5,102,19,a,2019-08-06 19:00:00
6,102,19,a,2019-08-06 19:00:01
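SELECT channel
,COUNT(DISTINCT uid) AS user_cnt
,SUM(amount) AS total_amount
FROM `order` -- order is a reserved word, so the table name is quoted
WHERE to_date(otime) = '2019-08-06' -- restricts to one day; drop this line for all-time totals
GROUP BY channel
;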
Number of users who logged in and read, number of books read, and related queries
Table A (login table):
ds user_id
2019-08-06 1
2019-08-06 2
2019-08-06 3
2019-08-06 4
Table B (reading table):
ds user_id read_num
2019-08-06 1 2
2019-08-06 2 3
2019-08-06 3 6
Table C (payment table):
ds user_id price
2019-08-06 1 55.6
2019-08-06 2 55.8
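Based on the three tables above, use Hive HQL to implement the following:
(1) The number of users who logged in and also read on the same day, and the number of books they read
(2) The number of users who logged in and read but did not pay
(3) For users who logged in and paid, the paying users' books and amounts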
--(1)
SELECT a.ds
,COUNT(DISTINCT a.user_id) AS user_cnt
,SUM(b.read_num) AS read_num
FROM tablea a
JOIN tableb b
ON a.ds = b.ds AND a.user_id = b.user_id
GROUP BY a.ds
;
--(2)
SELECT a.ds
,COUNT(DISTINCT a.user_id) AS user_cnt
FROM tablea a
JOIN tableb b
ON b.user_id = a.user_id AND b.ds = a.ds
LEFT JOIN tablec c
ON c.user_id = a.user_id AND c.ds = a.ds
WHERE c.user_id IS NULL -- keep only users with no payment row that day
GROUP BY a.ds
;
--(3)
SELECT c.user_id
,c.price
FROM tablec c
JOIN tablea a
ON a.ds = c.ds AND a.user_id = c.user_id
;
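Requirement (3) also mentions books. If the number of books read should appear next to the amount paid, one possible sketch is to LEFT JOIN table B so that paying users without a reading row are still kept. Treating read_num as the "books" figure is an assumption, and the column aliases are illustrative.

```sql
-- Paying users who also logged in that day, with their read count (0 if they did not read).
SELECT c.ds
      ,c.user_id
      ,COALESCE(b.read_num, 0) AS read_num
      ,c.price
FROM tablec c
JOIN tablea a
  ON a.ds = c.ds AND a.user_id = c.user_id
LEFT JOIN tableb b
  ON b.ds = c.ds AND b.user_id = c.user_id
;
```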
High-spender report
district table (华中 = Central China, 西南 = Southwest):
disid disname
1 华中
2 西南
city table:
cityid disid
1 1
2 1
3 2
4 2
5 2
order table:
oid userid cityid amount
1 1 1 1223.9
2 1 1 9999.9
3 2 2 2322
4 2 2 8909
5 2 3 6789
6 2 3 798
7 3 4 56786
8 4 5 78890
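A high spender is a user whose spending exceeds 10,000 (1W). Use Hive HQL to generate the following report:
Region name, number of high spenders, total spending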
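SELECT t.disname
,COUNT(t.userid) AS high_spender_cnt
,SUM(t.total_amount) AS total_amount
FROM
(
-- total spending per user within each region
SELECT b.disid
,c.disname
,a.userid
,SUM(a.amount) AS total_amount
FROM order_29 a
JOIN city b
ON a.cityid = b.cityid
JOIN district c
ON b.disid = c.disid
GROUP BY b.disid
,c.disname
,a.userid
) t
WHERE t.total_amount > 10000 -- threshold applied to spending within the region (an interpretation)
GROUP BY t.disid
,t.disname
;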
Use Hive HQL to find users who bought product 3 and their spending yesterday
Data:
orderid,userid,productid,price,timestamp,date
123,00101,3,200,1535945356,2019-08-28
124,00100,1,200,1535945356,2019-08-28
125,00101,3,200,1535945356,2019-08-29
126,00101,2,200,1535945356,2019-08-29
127,00102,5,200,1535945356,2019-08-29
128,00103,3,200,1535945356,2019-08-29
129,00103,3,200,1535945356,2019-08-29
-- users who bought product 3 both today and yesterday, with the two days' product-3 spending combined
SELECT userid
,cur + lastday AS total_amount
FROM
(
SELECT userid
,SUM(CASE WHEN dt = current_date THEN price ELSE 0 END) AS cur
,SUM(CASE WHEN dt = date_sub(current_date, 1) THEN price ELSE 0 END) AS lastday
FROM
(
SELECT userid
,price
,dt
FROM buy
WHERE productid = 3
AND (dt = current_date OR dt = date_sub(current_date, 1))
) tmp
GROUP BY userid
) tt
WHERE cur != 0 AND lastday != 0
;
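Read literally, the task asks for every user who has ever bought product 3, together with what that user spent yesterday. Under that reading (an assumption, as is counting yesterday's spending across all products), a single pass with conditional aggregation may be enough. The sketch reuses the buy table and dt column from the query above.

```sql
-- yesterday_amount: the user's total spending yesterday, 0 if there was none.
-- HAVING keeps only users with at least one purchase of product 3 on any day.
SELECT userid
      ,SUM(CASE WHEN dt = date_sub(current_date, 1) THEN price ELSE 0 END) AS yesterday_amount
FROM buy
GROUP BY userid
HAVING SUM(CASE WHEN productid = 3 THEN 1 ELSE 0 END) > 0
;
```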