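City X built a new stadium. Its daily visitor counts are recorded in a table with three columns: id, visit_date, and people.
Write a query that finds the peak periods. A peak period is a run of at least three consecutive rows in which people is at least 100 on every row.
For example, given the table stadium: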
+------+------------+-----------+
| id | visit_date | people |
+------+------------+-----------+
| 1 | 2017-01-01 | 10 |
| 2 | 2017-01-02 | 109 |
| 3 | 2017-01-03 | 150 |
| 4 | 2017-01-04 | 99 |
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
For the sample data above, the output is:
+------+------------+-----------+
| id | visit_date | people |
+------+------------+-----------+
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Note:
There is only one row per day, and dates increase as id increases.
Generating the data
CREATE TABLE IF NOT EXISTS stadium (id INT, visit_date DATE NULL, people INT);
INSERT INTO stadium (id, visit_date, people) VALUES ('1', '2017-01-01', '10');
INSERT INTO stadium (id, visit_date, people) VALUES ('2', '2017-01-02', '109');
INSERT INTO stadium (id, visit_date, people) VALUES ('3', '2017-01-03', '150');
INSERT INTO stadium (id, visit_date, people) VALUES ('4', '2017-01-04', '99');
INSERT INTO stadium (id, visit_date, people) VALUES ('5', '2017-01-05', '145');
INSERT INTO stadium (id, visit_date, people) VALUES ('6', '2017-01-06', '1455');
INSERT INTO stadium (id, visit_date, people) VALUES ('7', '2017-01-07', '199');
INSERT INTO stadium (id, visit_date, people) VALUES ('8', '2017-01-08', '188');
INSERT INTO stadium (id, visit_date, people) VALUES ('9', '2017-01-09', '66');
Solution
First, select all rows where people is at least 100:
SELECT *
FROM stadium AS S
WHERE S.`people` >=100;
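Join three copies of that result, each on the previous copy's id plus 1, so that every output row is a triple of consecutive ids: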
SELECT *
FROM (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp1
JOIN (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp2
ON tmp1.id = tmp2.id-1
JOIN (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp3
ON tmp2.id = tmp3.id-1
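Taking the UNION of the three id columns (which also removes duplicates) gives the result, but the code feels clumsy.
Borrowing from other people's solutions, it can be written like this:
Collect the three id columns, join them back to the table to pick the rows whose id matches any of the three, then deduplicate.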
SELECT DISTINCT SS.`id`, SS.`visit_date`, SS.`people`
FROM stadium AS SS
JOIN (SELECT tmp1.id AS id1, tmp2.id AS id2, tmp3.id AS id3
FROM (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp1
JOIN (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp2
ON tmp1.id = tmp2.id-1
JOIN (SELECT *
FROM stadium AS S
WHERE S.`people` >=100) tmp3
ON tmp2.id = tmp3.id-1
) AS a
ON SS.`id` = a.id1 OR SS.`id` = a.id2 OR SS.`id` = a.id3;
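But this is still verbose, and it does not generalize well to a peak defined as n consecutive days of at least 100 people.
User variables seem like a better fit.
1. First compute a running count of consecutive rows with people >= 100, called countt: it resets to 0 on a row below 100 and otherwise increases by one from the previous row. Note that this relies on the rows being read in id order, which MySQL does not guarantee without an ORDER BY.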
SELECT S.`id`, S.`visit_date`, S.`people`,
IF(S.`people`>=100, @countt:=@countt + 1, @countt:=0) AS countt
FROM stadium AS S, (SELECT @countt:=0) AS init
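To make the running count concrete, here is a minimal Python sketch of the same logic applied to the nine rows from the INSERT statements above. The function name running_counts and the threshold parameter are illustrative assumptions, not part of the original solution.

```python
# people values for ids 1..9, taken from the INSERT statements above
PEOPLE = [10, 109, 150, 99, 145, 1455, 199, 188, 66]

def running_counts(people, threshold=100):
    """For each row, return the length of the current run of values >= threshold."""
    counts = []
    run = 0
    for p in people:
        # Same rule as countt: add one on a busy day, reset to 0 otherwise
        run = run + 1 if p >= threshold else 0
        counts.append(run)
    return counts

print(running_counts(PEOPLE))  # [0, 1, 2, 0, 1, 2, 3, 4, 0]
```

Only ids 7 and 8 reach a count of 3 or more, even though ids 5 and 6 belong to the same run, which is the gap the next step has to close.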
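Rows with countt >= 3 certainly qualify, but how do we pick up the rows with countt of 1 or 2 that belong to the same run?
2. Add a flag column to the step 1 result, scanning in descending id order: flag is 1 when countt >= 3, or when the previously scanned row (the one with the next higher id) has flag = 1 and the current countt is not 0. Otherwise flag is 0.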
SELECT *, @flag:=IF(tmp.countt>=3 OR (@flag=1 AND tmp.countt <> 0), 1, 0) AS flag
FROM (SELECT S.`id`, S.`visit_date`, S.`people`,
IF(S.`people`>=100, @countt:=@countt + 1, @countt:=0) AS countt
FROM stadium AS S, (SELECT @countt:=0) AS init) tmp,
(SELECT @flag:=0) AS init2
ORDER BY tmp.id DESC
Scanning in reverse is really clever; I would not have thought of it.
After that, just select the rows where flag is 1:
SELECT tmp1.id, tmp1.`visit_date`, tmp1.`people`
FROM (
SELECT *, @flag:=IF(tmp.countt>=3 OR (@flag=1 AND tmp.countt <> 0), 1, 0) AS flag
FROM (SELECT S.`id`, S.`visit_date`, S.`people`,
IF(S.`people`>=100, @countt:=@countt + 1, @countt:=0) AS countt
FROM stadium AS S, (SELECT @countt:=0) AS init) tmp,
(SELECT @flag:=0) AS init2
ORDER BY tmp.id DESC) tmp1
WHERE tmp1.flag = 1
ORDER BY tmp1.id;
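The two passes can also be sketched in plain Python, which makes it easier to see why the reverse scan works. This is only an illustration under assumptions: the function name peak_ids and the min_run parameter (the "n days" generalization) are hypothetical, and ids are assumed to be consecutive starting at 1.

```python
def peak_ids(people, threshold=100, min_run=3):
    """Return the ids (1-based) of rows that belong to a run of at least
    min_run consecutive rows with people >= threshold."""
    # Forward pass: running count of consecutive busy rows (countt)
    counts = []
    run = 0
    for p in people:
        run = run + 1 if p >= threshold else 0
        counts.append(run)

    # Backward pass: a row is flagged if its own count is long enough, or if
    # the row after it is flagged and this row is still part of the same run
    flagged = []
    flag = 0
    for i in range(len(people) - 1, -1, -1):
        flag = 1 if counts[i] >= min_run or (flag == 1 and counts[i] != 0) else 0
        if flag:
            flagged.append(i + 1)
    return sorted(flagged)

print(peak_ids([10, 109, 150, 99, 145, 1455, 199, 188, 66]))  # [5, 6, 7, 8]
```

Changing min_run is all it takes to handle a peak defined as n consecutive days.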
Very neat; I almost did not want to look at any other approach.
But wait!
There is an even more concise way to define the variable:
SELECT id,visit_date,people,(
@cnt:=IF(people>99,@cnt+1,0)
) AS cnt
FROM stadium,(SELECT @cnt:=0) AS init;
This part is the same as step 1 above.
Next, take the Cartesian product with stadium:
SELECT *
FROM stadium AS S, (SELECT id,visit_date,people,(
@cnt:=IF(people>99,@cnt+1,0)
) AS cnt
FROM stadium,(SELECT @cnt:=0) AS init) AS tmp
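The result is off-putting: on a large table the unfiltered product would blow up.
We only need each row with cnt greater than 2 plus the cnt-1 days before it. Because ids are consecutive, that means the ids from id-cnt+1 through id. A run longer than three days produces overlapping ranges, so deduplicate the result.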
SELECT DISTINCT S.*
FROM stadium AS S, (SELECT id,visit_date,people,(
@cnt:=IF(people>99,@cnt+1,0)
) AS cnt
FROM stadium,(SELECT @cnt:=0) AS init) AS tmp
WHERE tmp.cnt>2 AND S.`id` BETWEEN tmp.id-tmp.cnt+1 AND tmp.id
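As a rough illustration of the range idea, the Python sketch below collects ids the same way: whenever the running count reaches the minimum run length, it adds every id from id-cnt+1 through id to a set, which handles the deduplication. The function name peak_ids_by_range and the min_run parameter are assumptions for this example, and ids are assumed to be consecutive starting at 1.

```python
def peak_ids_by_range(people, threshold=100, min_run=3):
    """Return ids (1-based) covered by any range [id - cnt + 1, id]
    where cnt, the running count at that id, is at least min_run."""
    ids = set()
    cnt = 0
    for row_id, p in enumerate(people, start=1):
        cnt = cnt + 1 if p >= threshold else 0
        if cnt >= min_run:
            # Mirrors: S.id BETWEEN tmp.id - tmp.cnt + 1 AND tmp.id
            ids.update(range(row_id - cnt + 1, row_id + 1))
    return sorted(ids)

print(peak_ids_by_range([10, 109, 150, 99, 145, 1455, 199, 188, 66]))  # [5, 6, 7, 8]
```

Here id 7 contributes the range 5..7 and id 8 contributes 5..8; the set keeps each id once, which is the job DISTINCT does in the query.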
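Another approach: self-join three copies of the table and enumerate where S1 sits in a run of three (first, middle, or last):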
SELECT DISTINCT S1.*
FROM stadium AS S1,stadium AS S2,stadium AS S3
WHERE S1.people>=100 AND S2.people>=100 AND S3.people>=100 AND (
S1.id +1 = S2.id AND S1.id+2=S3.id OR
S1.id +1 = S2.id AND S1.id-1=S3.id OR
S1.id -1 = S2.id AND S1.id-2=S3.id
)
ORDER BY S1.id