Trips table schema
+-------------+----------+
| Column Name | Type |
+-------------+----------+
| id | int |
| client_id | int |
| driver_id | int |
| city_id | int |
| status | enum |
| request_at | date |
+-------------+----------+
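This table stores every taxi trip. Each trip has a unique id; client_id and driver_id are foreign keys referencing users_id in the Users table.
status is an enum describing the trip status, with members ('completed', 'cancelled_by_driver', 'cancelled_by_client').
Users table schema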
+-------------+----------+
| Column Name | Type |
+-------------+----------+
| users_id | int |
| banned | enum |
| role | enum |
+-------------+----------+
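This table stores all users. Each user has a unique users_id. role is an enum describing the user's role, with members ('client', 'driver', 'partner').
banned is an enum indicating whether the user is banned, with members ('Yes', 'No').
The cancellation rate is computed as: (number of requests made by unbanned users that were cancelled by the driver or the client) / (total number of requests made by unbanned users).
Write a solution to find the cancellation rate of requests with unbanned users (both the client and the driver must be unbanned) for each day between "2013-10-01" and "2013-10-03". Unbanned users are those with banned = 'No'; banned users are those with banned = 'Yes'. Round Cancellation Rate to two decimal places.
Return the result table in any order.
Example 1:
Input:
Trips table: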
+----+-----------+-----------+---------+---------------------+------------+
| id | client_id | driver_id | city_id | status | request_at |
+----+-----------+-----------+---------+---------------------+------------+
| 1 | 1 | 10 | 1 | completed | 2013-10-01 |
| 2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-01 |
| 3 | 3 | 12 | 6 | completed | 2013-10-01 |
| 4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 |
| 5 | 1 | 10 | 1 | completed | 2013-10-02 |
| 6 | 2 | 11 | 6 | completed | 2013-10-02 |
| 7 | 3 | 12 | 6 | completed | 2013-10-02 |
| 8 | 2 | 12 | 12 | completed | 2013-10-03 |
| 9 | 3 | 10 | 12 | completed | 2013-10-03 |
| 10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 |
+----+-----------+-----------+---------+---------------------+------------+
Users table:
+----------+--------+--------+
| users_id | banned | role |
+----------+--------+--------+
| 1 | No | client |
| 2 | Yes | client |
| 3 | No | client |
| 4 | No | client |
| 10 | No | driver |
| 11 | No | driver |
| 12 | No | driver |
| 13 | No | driver |
+----------+--------+--------+
Output:
+------------+-------------------+
| Day | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33 |
| 2013-10-02 | 0.00 |
| 2013-10-03 | 0.50 |
+------------+-------------------+
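Explanation:
2013-10-01: there were 4 requests, but request 2 was made by a banned client (users_id = 2) and is ignored. Of the remaining 3 requests, 1 (request 4) was cancelled, so the cancellation rate is 1 / 3 = 0.33.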
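The definition of the cancellation rate can be checked by hand against the sample data. The following is a minimal Python sketch, not part of the original solution: the names `TRIPS`, `BANNED`, and `cancellation_rates` are made up for illustration, the data is copied from Example 1, and no date filter is applied because every sample trip already falls inside the requested range.

```python
from collections import defaultdict

# Sample data copied from Example 1 (city_id omitted, it does not affect the rate).
# Each trip: (id, client_id, driver_id, status, request_at)
TRIPS = [
    (1, 1, 10, "completed", "2013-10-01"),
    (2, 2, 11, "cancelled_by_driver", "2013-10-01"),
    (3, 3, 12, "completed", "2013-10-01"),
    (4, 4, 13, "cancelled_by_client", "2013-10-01"),
    (5, 1, 10, "completed", "2013-10-02"),
    (6, 2, 11, "completed", "2013-10-02"),
    (7, 3, 12, "completed", "2013-10-02"),
    (8, 2, 12, "completed", "2013-10-03"),
    (9, 3, 10, "completed", "2013-10-03"),
    (10, 4, 13, "cancelled_by_driver", "2013-10-03"),
]
# users_id -> banned flag, copied from the Users table
BANNED = {1: "No", 2: "Yes", 3: "No", 4: "No",
          10: "No", 11: "No", 12: "No", 13: "No"}


def cancellation_rates(trips, banned):
    """Return {day: rate}, counting only trips whose client AND driver are unbanned."""
    total = defaultdict(int)
    cancelled = defaultdict(int)
    for _trip_id, client_id, driver_id, status, day in trips:
        if banned[client_id] != "No" or banned[driver_id] != "No":
            continue  # a trip with any banned participant is ignored entirely
        total[day] += 1
        if status != "completed":
            cancelled[day] += 1
    return {day: round(cancelled[day] / total[day], 2) for day in sorted(total)}


rates = cancellation_rates(TRIPS, BANNED)
print(rates)
```

Trips 2, 6, and 8 are skipped because their client (users_id = 2) is banned, which leaves 3, 2, and 2 trips on the three days.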
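Computing the daily cancellation rate for unbanned users
Find the trips whose users are not banned
-- Incorrect example
SELECT T.*, U.* FROM Trips AS T JOIN Users AS U
ON (T.client_id = U.users_id OR T.driver_id = U.users_id) AND U.banned = 'No'
At first glance this looks right, but it is wrong. The OR condition keeps a trip as soon as either its client or its driver matches an unbanned user, so it only filters correctly under the hidden assumption that client_id and driver_id refer to the same user. A trip with a banned client is still returned through its unbanned driver, and a trip whose two users are both unbanned is returned twice.
Incorrect result (excerpt)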
id | client_id | driver_id | city_id | status | request_at | users_id | banned | role |
---|---|---|---|---|---|---|---|---|
1 | 1 | 10 | 1 | completed | 2013-10-01 | 10 | No | driver |
1 | 1 | 10 | 1 | completed | 2013-10-01 | 1 | No | client |
2 | 2 | 11 | 1 | cancelled_by_driver | 2013-10-01 | 11 | No | driver |
3 | 3 | 12 | 6 | completed | 2013-10-01 | 12 | No | driver |
3 | 3 | 12 | 6 | completed | 2013-10-01 | 3 | No | client |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 | 13 | No | driver |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 | 4 | No | client |
5 | 1 | 10 | 1 | completed | 2013-10-02 | 10 | No | driver |
5 | 1 | 10 | 1 | completed | 2013-10-02 | 1 | No | client |
6 | 2 | 11 | 6 | completed | 2013-10-02 | 11 | No | driver |
The user with users_id = 2 is banned in the Users table, yet the trip with id = 6, whose client_id is 2, still appears in the query result: its record was not filtered out.
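The flaw in the OR join can be reproduced on the sample data. The sketch below is an illustration only: it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MySQL, declares the enum columns as plain TEXT, and counts how many Users rows each trip matches. The variable names are hypothetical.

```python
import sqlite3

# In-memory SQLite database loaded with the Example 1 data (enums stored as TEXT).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (users_id INTEGER PRIMARY KEY, banned TEXT, role TEXT);
CREATE TABLE Trips (id INTEGER PRIMARY KEY, client_id INTEGER, driver_id INTEGER,
                    city_id INTEGER, status TEXT, request_at TEXT);
INSERT INTO Users VALUES
  (1,'No','client'), (2,'Yes','client'), (3,'No','client'), (4,'No','client'),
  (10,'No','driver'), (11,'No','driver'), (12,'No','driver'), (13,'No','driver');
INSERT INTO Trips VALUES
  (1,1,10,1,'completed','2013-10-01'),
  (2,2,11,1,'cancelled_by_driver','2013-10-01'),
  (3,3,12,6,'completed','2013-10-01'),
  (4,4,13,6,'cancelled_by_client','2013-10-01'),
  (5,1,10,1,'completed','2013-10-02'),
  (6,2,11,6,'completed','2013-10-02'),
  (7,3,12,6,'completed','2013-10-02'),
  (8,2,12,12,'completed','2013-10-03'),
  (9,3,10,12,'completed','2013-10-03'),
  (10,4,13,12,'cancelled_by_driver','2013-10-03');
""")

# Same join condition as the incorrect example, reduced to a per-trip match count.
wrong_sql = """
SELECT T.id, COUNT(*) AS matches
FROM Trips AS T JOIN Users AS U
  ON (T.client_id = U.users_id OR T.driver_id = U.users_id) AND U.banned = 'No'
GROUP BY T.id
ORDER BY T.id
"""
matches = dict(conn.execute(wrong_sql).fetchall())

# Trips matched only once survive through a single unbanned participant.
single_match = sorted(trip_id for trip_id, n in matches.items() if n == 1)
print(matches)
print(single_match)
```

All ten trips come back: those with two unbanned users are matched twice, and the trips of the banned client (ids 2, 6, and 8) are still matched once through their driver.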
Correct SQL
-- Correct example
SELECT * FROM Trips AS tt
JOIN Users AS U1 ON (tt.client_id = U1.users_id AND U1.banned ='No')
JOIN Users AS U2 ON (tt.driver_id = U2.users_id AND U2.banned ='No')
Query result
id | client_id | driver_id | city_id | status | request_at | users_id | banned | role | users_id | banned | role |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 10 | 1 | completed | 2013-10-01 | 1 | No | client | 10 | No | driver |
3 | 3 | 12 | 6 | completed | 2013-10-01 | 3 | No | client | 12 | No | driver |
4 | 4 | 13 | 6 | cancelled_by_client | 2013-10-01 | 4 | No | client | 13 | No | driver |
5 | 1 | 10 | 1 | completed | 2013-10-02 | 1 | No | client | 10 | No | driver |
7 | 3 | 12 | 6 | completed | 2013-10-02 | 3 | No | client | 12 | No | driver |
9 | 3 | 10 | 12 | completed | 2013-10-03 | 3 | No | client | 10 | No | driver |
10 | 4 | 13 | 12 | cancelled_by_driver | 2013-10-03 | 4 | No | client | 13 | No | driver |
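The two-join filter can be verified the same way. This is a sketch under the assumption that SQLite (via Python's `sqlite3`) behaves like MySQL for this inner join. Because U1 and U2 expose identically named columns, the query aliases them as `client` and `driver` (hypothetical names) so that each side stays distinguishable.

```python
import sqlite3

# In-memory SQLite database loaded with the Example 1 data (enums stored as TEXT).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (users_id INTEGER PRIMARY KEY, banned TEXT, role TEXT);
CREATE TABLE Trips (id INTEGER PRIMARY KEY, client_id INTEGER, driver_id INTEGER,
                    city_id INTEGER, status TEXT, request_at TEXT);
INSERT INTO Users VALUES
  (1,'No','client'), (2,'Yes','client'), (3,'No','client'), (4,'No','client'),
  (10,'No','driver'), (11,'No','driver'), (12,'No','driver'), (13,'No','driver');
INSERT INTO Trips VALUES
  (1,1,10,1,'completed','2013-10-01'),
  (2,2,11,1,'cancelled_by_driver','2013-10-01'),
  (3,3,12,6,'completed','2013-10-01'),
  (4,4,13,6,'cancelled_by_client','2013-10-01'),
  (5,1,10,1,'completed','2013-10-02'),
  (6,2,11,6,'completed','2013-10-02'),
  (7,3,12,6,'completed','2013-10-02'),
  (8,2,12,12,'completed','2013-10-03'),
  (9,3,10,12,'completed','2013-10-03'),
  (10,4,13,12,'cancelled_by_driver','2013-10-03');
""")

# One join per role: a trip survives only if BOTH its client and its driver are unbanned.
correct_sql = """
SELECT tt.id, U1.users_id AS client, U2.users_id AS driver
FROM Trips AS tt
JOIN Users AS U1 ON (tt.client_id = U1.users_id AND U1.banned = 'No')
JOIN Users AS U2 ON (tt.driver_id = U2.users_id AND U2.banned = 'No')
ORDER BY tt.id
"""
kept = conn.execute(correct_sql).fetchall()
print(kept)
```

Each surviving trip appears exactly once, and trips 2, 6, and 8 (banned client 2) are gone.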
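Per day, count the cancelled trips and the total trips of unbanned users
Group by day: group by request_at
Cancelled trips: SUM(IF(tt.status = 'completed',0 ,1))
Total trips: count(tt.status)
-- Cancellation rate per day (no date filter or column aliases yet)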
select round(SUM(IF(tt.status = 'completed',0 ,1))/count(tt.status), 2) from Trips tt
join Users tu1 on (tt.client_id = tu1.users_id and tu1.banned = 'No')
join Users tu2 on (tt.driver_id = tu2.users_id and tu2.banned = 'No')
group by request_at;
Add the Day and Cancellation Rate column aliases and restrict the date range
select tt.request_at as Day, round(SUM(IF(tt.status = 'completed',0 ,1))/count(tt.status), 2) as `Cancellation Rate`
from Trips tt
join Users tu1 on (tt.client_id = tu1.users_id and tu1.banned = 'No')
join Users tu2 on (tt.driver_id = tu2.users_id and tu2.banned = 'No')
where tt.request_at BETWEEN '2013-10-01' AND '2013-10-03'
group by tt.request_at;
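The final query can be checked end to end against the expected output. The sketch below again assumes SQLite (via Python's `sqlite3`) as a stand-in for MySQL, which forces two deliberate deviations from the query above: `CASE WHEN` replaces MySQL's `IF()`, and the numerator is multiplied by 1.0 because SQLite would otherwise perform integer division.

```python
import sqlite3

# In-memory SQLite database loaded with the Example 1 data (enums stored as TEXT).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (users_id INTEGER PRIMARY KEY, banned TEXT, role TEXT);
CREATE TABLE Trips (id INTEGER PRIMARY KEY, client_id INTEGER, driver_id INTEGER,
                    city_id INTEGER, status TEXT, request_at TEXT);
INSERT INTO Users VALUES
  (1,'No','client'), (2,'Yes','client'), (3,'No','client'), (4,'No','client'),
  (10,'No','driver'), (11,'No','driver'), (12,'No','driver'), (13,'No','driver');
INSERT INTO Trips VALUES
  (1,1,10,1,'completed','2013-10-01'),
  (2,2,11,1,'cancelled_by_driver','2013-10-01'),
  (3,3,12,6,'completed','2013-10-01'),
  (4,4,13,6,'cancelled_by_client','2013-10-01'),
  (5,1,10,1,'completed','2013-10-02'),
  (6,2,11,6,'completed','2013-10-02'),
  (7,3,12,6,'completed','2013-10-02'),
  (8,2,12,12,'completed','2013-10-03'),
  (9,3,10,12,'completed','2013-10-03'),
  (10,4,13,12,'cancelled_by_driver','2013-10-03');
""")

# SQLite adaptation of the final query: CASE instead of IF(), * 1.0 to avoid integer division.
final_sql = """
SELECT tt.request_at AS Day,
       ROUND(SUM(CASE WHEN tt.status = 'completed' THEN 0 ELSE 1 END) * 1.0
             / COUNT(tt.status), 2) AS "Cancellation Rate"
FROM Trips tt
JOIN Users tu1 ON (tt.client_id = tu1.users_id AND tu1.banned = 'No')
JOIN Users tu2 ON (tt.driver_id = tu2.users_id AND tu2.banned = 'No')
WHERE tt.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY tt.request_at
ORDER BY tt.request_at
"""
result = conn.execute(final_sql).fetchall()
for day, rate in result:
    print(day, f"{rate:.2f}")
```

The `ORDER BY` is only there to make the printed rows deterministic; the problem accepts the result in any order.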