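Here is the problem:
Given a user-behavior table tracking_log with roughly the following fields:
(user_id: user ID, op_id: operation ID, op_time: operation time)
Requirement: for each day, count the users who performed operation A immediately followed by operation B. The A and B operations must be adjacent.
The following MySQL statements generate sample data and can be tested in sqlfiddle: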
create table tracking_log(
id int primary key AUTO_INCREMENT,
user_id int not null,
op_id char(4) not null,
op_time datetime not null
);
insert into tracking_log(user_id, op_id, op_time) values
(1, 'A', '2020-1-1 12:01:03'),
(2, 'A', '2020-1-1 12:01:04'),
(3, 'A', '2020-1-1 12:01:05'),
(1, 'B', '2020-1-1 12:03:03'),
(1, 'A', '2020-1-1 12:04:03'),
(1, 'C', '2020-1-1 12:06:03'),
(2, 'A', '2020-1-1 12:07:04'),
(3, 'B', '2020-1-1 12:08:05'),
(2, 'C', '2020-1-1 12:09:03'),
(2, 'A', '2020-1-1 12:10:03'),
(1, 'A', '2020-1-2 12:01:03'),
(2, 'A', '2020-1-2 12:01:04'),
(3, 'A', '2020-1-2 12:01:05'),
(1, 'B', '2020-1-2 12:03:03'),
(1, 'A', '2020-1-2 12:04:03'),
(1, 'C', '2020-1-2 12:06:03'),
(2, 'A', '2020-1-2 12:07:04'),
(3, 'B', '2020-1-2 12:08:05'),
(2, 'C', '2020-1-2 12:09:03'),
(2, 'A', '2020-1-2 12:10:03');
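First, each user's operations on a given day can be viewed as a sequence, so group_concat
is a natural way to join each user's operations into a single string: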
select convert(op_time, date) as date, user_id, group_concat(op_id order by op_time) as track
from tracking_log
group by convert(op_time, date), user_id
order by date, user_id
;
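Note that the order by inside group_concat is required; without it, the order of the concatenated values is not guaranteed. With the sample data, the track on each of the two days is A,B,A,C for user 1, A,A,C,A for user 2, and A,B for user 3.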
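Two practical details can affect the concatenated track. The following is a minimal sketch, assuming MySQL: adding id as a tie-breaker keeps the order deterministic when two rows of the same user share an op_time, and raising group_concat_max_len (1024 by default) avoids silently truncated tracks for very active users. The value 1000000 is an arbitrary choice for illustration.

```sql
-- Raise the result length limit of group_concat for this session only.
-- 1000000 is an illustrative value, not a recommendation.
set session group_concat_max_len = 1000000;

-- Same track query, with id as a tie-breaker for identical timestamps.
select date(op_time) as date, user_id,
       group_concat(op_id order by op_time, id) as track
from tracking_log
group by date(op_time), user_id
order by date, user_id
;
```

Here date(op_time) is equivalent to convert(op_time, date); either form works.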
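The rest is straightforward: a plain string match on the track finds the behavior pattern we care about. The pattern '%A,B%' is only safe when no other op_id ends in A or starts with B, which holds for the single-letter values in the sample data: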
select convert(op_time, date) as date, user_id, group_concat(op_id order by op_time) as track
from tracking_log
group by convert(op_time, date), user_id
having group_concat(op_id order by op_time) like '%A,B%'
order by date, user_id
;
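If op_id values can be longer than one character (the column is char(4)), the pattern '%A,B%' would also match a sequence such as AA,BC. One possible hardening, sketched here as a variant rather than as part of the original solution, wraps the track in separators so that only whole op_id values can match. It assumes that no op_id contains a comma.

```sql
-- Wrap the track in commas: ',A,B,A,C,' so that ',A,B,' matches
-- only the complete values A and B sitting next to each other.
select date(op_time) as date, user_id,
       group_concat(op_id order by op_time) as track
from tracking_log
group by date(op_time), user_id
having concat(',', group_concat(op_id order by op_time), ',') like '%,A,B,%'
order by date, user_id
;
```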
Finally, wrap the filter in a subquery and count the matching users per day:
select t.date, count(*) as num from
(
select convert(op_time, date) as date, user_id
from tracking_log
group by convert(op_time, date), user_id
having group_concat(op_id order by op_time) like '%A,B%'
) t
group by t.date
;
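With the sample data, this returns num = 2 for both 2020-01-01 and 2020-01-02, since users 1 and 3 each have an A directly followed by a B on both days.

As an alternative that avoids string matching altogether, the same count can be sketched with the lag window function. This assumes MySQL 8.0 or later, where window functions are available; the names op_date and prev_op are illustrative and do not appear in the original queries.

```sql
-- For each row, look up the same user's previous operation on the same day,
-- then keep the rows where a B directly follows an A.
select op_date, count(distinct user_id) as num
from (
    select user_id,
           date(op_time) as op_date,
           op_id,
           lag(op_id) over (
               partition by user_id, date(op_time)
               order by op_time, id
           ) as prev_op
    from tracking_log
) t
where prev_op = 'A' and op_id = 'B'
group by op_date
order by op_date
;
```

count(distinct user_id) is needed because a user may perform the A, B pair more than once in a day. This version compares whole op_id values, so it does not depend on how the values are spelled, and it is not subject to the group_concat length limit.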