Pinduoduo SQL Interview

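1. How would you write SQL to compute the median, the mean, and the mode (using a method other than count)?
2. Answer two SQL questions orally (one about retention rate, one that requires a row number).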
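
For question 1, one possible approach is sketched below. It is not taken from the interview itself: the `scores` table and its `v` column are hypothetical, and the SQL is run through Python's built-in sqlite3 module (assuming SQLite 3.25 or later, for window functions) so that it can be executed as written. The median pairs an ascending ROW_NUMBER() with a descending one, and the mode ranks groups by SUM(1) instead of COUNT(*).

```python
import sqlite3

MEDIAN_SQL = """
WITH ranked AS (
    SELECT v,
           ROW_NUMBER() OVER (ORDER BY v ASC,  id ASC)  AS rn_asc,
           ROW_NUMBER() OVER (ORDER BY v DESC, id DESC) AS rn_desc
    FROM scores
)
-- The middle row (odd n) has rn_asc = rn_desc; the two middle rows
-- (even n) differ by exactly 1, so averaging them gives the median.
SELECT AVG(v) FROM ranked WHERE ABS(rn_asc - rn_desc) <= 1
"""

MEAN_SQL = "SELECT AVG(v) FROM scores"

# SUM(1) stands in for COUNT(*); ties are broken by the smaller value.
MODE_SQL = "SELECT v FROM scores GROUP BY v ORDER BY SUM(1) DESC, v ASC LIMIT 1"


def summarize(values):
    """Return (mean, median, mode) of values, computed in SQL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: id is only a tiebreaker for duplicate values.
    conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, v REAL)")
    conn.executemany("INSERT INTO scores (v) VALUES (?)", [(x,) for x in values])
    mean = conn.execute(MEAN_SQL).fetchone()[0]
    median = conn.execute(MEDIAN_SQL).fetchone()[0]
    mode = conn.execute(MODE_SQL).fetchone()[0]
    conn.close()
    return mean, median, mode
```

For the values 1, 2, 2, 3, 10, `summarize` returns a mean of 3.6, a median of 2.0, and a mode of 2.0. The same queries should carry over to MySQL 8.0+ with little change, since they rely only on standard window functions.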

1) Retention rate: omitted

2) Setting a row number in MySQL:

SET @row_number := 0;
SELECT (@row_number := @row_number + 1) AS num FROM table_name; -- table_name is a placeholder
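
As an alternative to the user-variable idiom, MySQL 8.0+ and most other engines provide the ROW_NUMBER() window function. The sketch below is an illustration rather than part of the original answer: it assumes a hypothetical `items` table and runs in SQLite (3.25 or later) through Python's sqlite3 module, although the SELECT itself is standard SQL.

```python
import sqlite3


def numbered_rows(names):
    """Return (num, name) pairs, numbering rows alphabetically by name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO items (name) VALUES (?)", [(n,) for n in names])
    rows = conn.execute(
        "SELECT ROW_NUMBER() OVER (ORDER BY name) AS num, name "
        "FROM items ORDER BY num"
    ).fetchall()
    conn.close()
    return rows
```

Calling `numbered_rows(["pear", "apple", "fig"])` numbers apple as 1, fig as 2, and pear as 3. Unlike the variable-based version, the numbering order is stated explicitly in the OVER clause.
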
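3. A database table Tourists records the number of visitors to a scenic spot on each day of July, as follows:
id date visits
1 2017-07-01 100 ……
Select the dates on which the visitor count was above 100 for three consecutive days. The output for the example above is:
date
2017-07-01 ……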

CREATE TABLE tourists (date DATE, visits INTEGER);
INSERT INTO tourists (date, visits) VALUES ('2017-07-01', 101);
INSERT INTO tourists (date, visits) VALUES ('2017-07-02', 109);
INSERT INTO tourists (date, visits) VALUES ('2017-07-03', 150);
INSERT INTO tourists (date, visits) VALUES ('2017-07-04', 99);
INSERT INTO tourists (date, visits) VALUES ('2017-07-05', 145);
INSERT INTO tourists (date, visits) VALUES ('2017-07-06', 125);
INSERT INTO tourists (date, visits) VALUES ('2017-07-07', 199);
INSERT INTO tourists (date, visits) VALUES ('2017-07-08', 188);
INSERT INTO tourists (date, visits) VALUES ('2017-07-09', 98);
INSERT INTO tourists (date, visits) VALUES ('2017-07-10', 198);
INSERT INTO tourists (date, visits) VALUES ('2017-07-11', 129);
INSERT INTO tourists (date, visits) VALUES ('2017-07-12', 104);
INSERT INTO tourists (date, visits) VALUES ('2017-07-13', 111);
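
One way to answer question 3 is the "gaps and islands" technique, sketched below against the sample rows. This is an assumed solution, not one given in the interview: it reads the question as asking for every date that belongs to a run of at least three consecutive days with more than 100 visits, and it runs in SQLite (3.25 or later) through Python's sqlite3 module. `julianday` is SQLite-specific, so another engine would need its own date arithmetic.

```python
import sqlite3

# Same rows as the INSERT statements above.
VISITS = [
    ("2017-07-01", 101), ("2017-07-02", 109), ("2017-07-03", 150),
    ("2017-07-04", 99), ("2017-07-05", 145), ("2017-07-06", 125),
    ("2017-07-07", 199), ("2017-07-08", 188), ("2017-07-09", 98),
    ("2017-07-10", 198), ("2017-07-11", 129), ("2017-07-12", 104),
    ("2017-07-13", 111),
]

STREAK_SQL = """
WITH busy AS (
    -- Within a run of consecutive busy days, (day number - row number)
    -- stays constant, so it can serve as a label for the run.
    SELECT date,
           CAST(julianday(date) AS INTEGER)
               - ROW_NUMBER() OVER (ORDER BY date) AS run_id
    FROM tourists
    WHERE visits > 100
),
sized AS (
    SELECT date, COUNT(*) OVER (PARTITION BY run_id) AS run_len
    FROM busy
)
SELECT date FROM sized WHERE run_len >= ? ORDER BY date
"""


def busy_streak_dates(rows, min_days=3):
    """Return dates that lie in a run of at least min_days busy days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tourists (date DATE, visits INTEGER)")
    conn.executemany("INSERT INTO tourists (date, visits) VALUES (?, ?)", rows)
    dates = [r[0] for r in conn.execute(STREAK_SQL, (min_days,))]
    conn.close()
    return dates
```

With the sample data, `busy_streak_dates(VISITS)` returns 11 dates: every day from 2017-07-01 to 2017-07-13 except 2017-07-04 and 2017-07-09, the two days that break the runs.
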
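4. The user login log table has the columns user_id, log_id, session_id, plat, visit_date. Write SQL to find the average number of users logging in per day over the last 30 days. Write SQL to find the number of users who visited on 7 or more consecutive days within the last 30 days.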
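
Both parts of the login-log question can be sketched as follows. Several things here are assumptions rather than facts from the question: the table name `login_log`, the use of an explicit reference date `as_of` in place of "today", visit_date stored as `YYYY-MM-DD` text, and SQLite (3.25 or later) as the engine, driven through Python's sqlite3 module. The streak query reuses the row-number trick: within a user's run of consecutive days, day number minus row number is constant.

```python
import sqlite3

# Average daily distinct users over the 30 days ending at :as_of.
# Days with no logins produce no row, so they do not lower the average.
AVG_DAU_SQL = """
SELECT AVG(dau) FROM (
    SELECT visit_date, COUNT(DISTINCT user_id) AS dau
    FROM login_log
    WHERE visit_date > date(:as_of, '-30 days') AND visit_date <= :as_of
    GROUP BY visit_date
)
"""

# Users with a run of at least :min_days consecutive login days in that window.
STREAK_USERS_SQL = """
WITH days AS (
    SELECT DISTINCT user_id, visit_date
    FROM login_log
    WHERE visit_date > date(:as_of, '-30 days') AND visit_date <= :as_of
),
labeled AS (
    SELECT user_id,
           CAST(julianday(visit_date) AS INTEGER)
               - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY visit_date)
               AS run_id
    FROM days
),
runs AS (
    SELECT user_id FROM labeled
    GROUP BY user_id, run_id
    HAVING COUNT(*) >= :min_days
)
SELECT COUNT(DISTINCT user_id) FROM runs
"""


def login_stats(rows, as_of, min_days=7):
    """rows: (user_id, visit_date) pairs; returns (avg_dau, streak_users)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE login_log (user_id, log_id, session_id, plat, visit_date)"
    )
    conn.executemany(
        "INSERT INTO login_log (user_id, visit_date) VALUES (?, ?)", rows
    )
    params = {"as_of": as_of, "min_days": min_days}
    avg_dau = conn.execute(AVG_DAU_SQL, params).fetchone()[0]
    streak_users = conn.execute(STREAK_USERS_SQL, params).fetchone()[0]
    conn.close()
    return avg_dau, streak_users
```

The DISTINCT in the `days` step matters because a user may log in several times on the same day, and duplicate rows would otherwise break the constant-difference property of a run.
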
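5. PV table a (columns user_id, goods_id) and click table b (columns user_id, goods_id) each hold 500,000 rows. While guarding against data skew, write a single SQL statement that finds the user_id values common to both tables and the corresponding goods_id.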
set hive.auto.convert.join=true;
select a.user_id, a.goods_id from a join b on coalesce(a.user_id, concat('hive', rand())) = b.user_id;
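
The COALESCE in the join key targets rows whose user_id is NULL: under hash partitioning they would all land on the same reducer, whereas replacing each NULL with a distinct random string spreads them out, and since such a string is not expected to match any real user_id in b, the join result stays the same. The Python simulation below only illustrates that reasoning. It is not Hive code, and the CRC32-modulo partitioner, the reducer count, and the sample data are all assumptions.

```python
import random
import zlib
from collections import Counter


def reducer_loads(keys, reducers=10):
    """Count how many rows land on each reducer under hash partitioning."""
    loads = Counter(zlib.crc32(str(k).encode()) % reducers for k in keys)
    return [loads.get(r, 0) for r in range(reducers)]


def salt_nulls(keys, seed=0):
    """Mimic COALESCE(user_id, CONCAT('hive', RAND())) for NULL keys."""
    rng = random.Random(seed)
    return [k if k is not None else "hive%s" % rng.random() for k in keys]


# Hypothetical data: 100 real user ids plus 900 rows with a NULL user_id.
keys = list(range(100)) + [None] * 900
plain = reducer_loads(keys)               # every NULL goes to a single reducer
salted = reducer_loads(salt_nulls(keys))  # NULLs are spread across reducers
```

In `plain`, one reducer receives at least 900 of the 1,000 rows, while in `salted` the rows are spread roughly evenly across the ten reducers.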

6. Extract the number of new users for each day of a month.
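
The question gives no schema, so the following is only a sketch under stated assumptions: a hypothetical `login_log(user_id, visit_date)` table, a user counting as "new" on the date of their earliest login, and SQLite driven through Python's sqlite3 module. The month is passed as a half-open date range.

```python
import sqlite3

NEW_USERS_SQL = """
WITH first_seen AS (
    -- A user's earliest login date is treated as the day they were new.
    SELECT user_id, MIN(visit_date) AS first_date
    FROM login_log
    GROUP BY user_id
)
SELECT first_date, COUNT(*) AS new_users
FROM first_seen
WHERE first_date >= :start_date AND first_date < :end_date
GROUP BY first_date
ORDER BY first_date
"""


def daily_new_users(rows, start_date, end_date):
    """rows: (user_id, visit_date) pairs; counts first-time users per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE login_log (user_id INTEGER, visit_date TEXT)")
    conn.executemany("INSERT INTO login_log VALUES (?, ?)", rows)
    params = {"start_date": start_date, "end_date": end_date}
    result = conn.execute(NEW_USERS_SQL, params).fetchall()
    conn.close()
    return result
```

The first-login date is computed over the whole log before the month filter is applied, so a user who first appeared in an earlier month is not counted again. Days with no new users simply do not appear in the result.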

7. Extract the users who had interaction activity on the current day but did not place an order.
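
This kind of question is usually answered with an anti-join. The sketch below uses NOT EXISTS and rests on assumptions the question does not state: hypothetical tables `interactions(user_id, event_date)` and `orders(user_id, order_date)`, "did not place an order" meaning no order on that same day, and SQLite driven through Python's sqlite3 module.

```python
import sqlite3

NO_ORDER_SQL = """
SELECT DISTINCT i.user_id
FROM interactions AS i
WHERE i.event_date = :day
  AND NOT EXISTS (
      -- Exclude users who have at least one order on the same day.
      SELECT 1 FROM orders AS o
      WHERE o.user_id = i.user_id AND o.order_date = :day
  )
ORDER BY i.user_id
"""


def active_without_order(interactions, orders, day):
    """Return ids of users who interacted on day but placed no order on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE interactions (user_id INTEGER, event_date TEXT)")
    conn.execute("CREATE TABLE orders (user_id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO interactions VALUES (?, ?)", interactions)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    users = [r[0] for r in conn.execute(NO_ORDER_SQL, {"day": day})]
    conn.close()
    return users
```

An equivalent formulation is a LEFT JOIN from interactions to orders that keeps only the rows where the order side is NULL.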
