题目
活动纪录表:Activity
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| session_id | int |
| activity_date | date |
| activity_type | enum |
+---------------+---------+
该表是用户在社交网站的活动记录。
该表没有主键,可能包含重复数据。
activity_type 字段为以下四种值 ('open_session', 'end_session', 'scroll_down', 'send_message')。
每个 session_id 只属于一个用户。
请写SQL查询出截至 2019-07-27(包含2019-07-27),近 30天的每日活跃用户(当天只要有一条活跃记录,即为活跃用户),
查询结果示例如下:
Activity table:
+---------+------------+---------------+---------------+
| user_id | session_id | activity_date | activity_type |
+---------+------------+---------------+---------------+
| 1 | 1 | 2019-07-20 | open_session |
| 1 | 1 | 2019-07-20 | scroll_down |
| 1 | 1 | 2019-07-20 | end_session |
| 2 | 4 | 2019-07-20 | open_session |
| 2 | 4 | 2019-07-21 | send_message |
| 2 | 4 | 2019-07-21 | end_session |
| 3 | 2 | 2019-07-21 | open_session |
| 3 | 2 | 2019-07-21 | send_message |
| 3 | 2 | 2019-07-21 | end_session |
| 4 | 3 | 2019-06-25 | open_session |
| 4 | 3 | 2019-06-25 | end_session |
+---------+------------+---------------+---------------+
Result table:
+------------+--------------+
| day | active_users |
+------------+--------------+
| 2019-07-20 | 2 |
| 2019-07-21 | 2 |
+------------+--------------+
非活跃用户的记录不需要展示。
生成数据
CREATE TABLE Activity1 (user_id INT, session_id INT, activity_date DATE, activity_type VARCHAR(20));
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('1', '1', '2019-07-20', 'open_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('1', '1', '2019-07-20', 'scroll_down');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('1', '1', '2019-07-20', 'end_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('2', '4', '2019-07-20', 'open_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('2', '4', '2019-07-21', 'send_message');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('2', '4', '2019-07-21', 'end_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('3', '2', '2019-07-21', 'open_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('3', '2', '2019-07-21', 'send_message');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('3', '2', '2019-07-21', 'end_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('4', '3', '2019-06-25', 'open_session');
INSERT INTO Activity1 (user_id, session_id, activity_date, activity_type) VALUES ('4', '3', '2019-06-25', 'end_session');
解答
首先选择近 30天的每日活跃用户
SELECT *
FROM Activity1 AS A
WHERE DATEDIFF('2019-07-27', A.`activity_date`) <= 30;
然后对日期进行分组 选出每个组的uid个数(去重)
SELECT A.`activity_date` AS DAY, COUNT(DISTINCT A.`user_id`) AS active_users
FROM Activity1 AS A
WHERE DATEDIFF('2019-07-27', A.`activity_date`) <= 30
GROUP BY A.`activity_date`;