Problem
Write a SQL query to delete all duplicate emails from the Person table, keeping only the row with the smallest Id for each duplicated email.
+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
+----+------------------+
Id is the primary key of this table.
For example, after running your query, the Person table above should contain the following rows:
+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
+----+------------------+
Notes:
After the SQL runs, the output is the entire Person table.
Use a DELETE statement.
Reading the problem
The problem requires a DELETE statement.
Creating the data
CREATE TABLE Person2(
Id INT,
Email VARCHAR(50),
PRIMARY KEY(Id)
);
INSERT INTO Person2 VALUES (1,'[email protected]'),(2,'[email protected] '),(3,'[email protected]');
My solution
How would I do this without a DELETE statement?
Select the smallest Id for each email, then pick out the rows with those Ids.
SELECT *
FROM Person2
WHERE Id IN (SELECT MIN(Id)
FROM Person2
GROUP BY Email);
DELETE syntax
DELETE FROM table_name [WHERE Clause]
So I should just be able to delete the rows whose Id is not the smallest one for its email:
DELETE FROM Person2
WHERE Id NOT IN (SELECT MIN(Id)
FROM Person2
GROUP BY Email);
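Error code: 1093
You can't specify target table 'Person2' for update in FROM clause
I didn't understand why this failed at first. According to some comments, MySQL does not allow a statement to delete from or update a table while a subquery in the same statement selects from that table. The fix is to wrap the subquery result in a derived table (a temporary table), so the outer DELETE reads from that instead.
That makes sense: querying a table while deleting rows from that same table is not reasonable.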
DELETE FROM Person2
WHERE Id NOT IN (SELECT tmp.Id
FROM (SELECT MIN(Id) AS Id
FROM Person2
GROUP BY Email) AS tmp);
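To see the derived-table workaround run end to end, here is a minimal sketch on a throwaway table. The table name PersonDemo and the example.com addresses are assumptions made up for illustration and are not from the original post. The syntax assumes MySQL.

```sql
-- Hypothetical demo table and data (not from the original post).
DROP TABLE IF EXISTS PersonDemo;
CREATE TABLE PersonDemo (
    Id INT,
    Email VARCHAR(50),
    PRIMARY KEY (Id)
);
INSERT INTO PersonDemo VALUES
    (1, 'a@example.com'),
    (2, 'b@example.com'),
    (3, 'a@example.com');

-- Wrapping the grouped subquery in a derived table (tmp) avoids error 1093.
DELETE FROM PersonDemo
WHERE Id NOT IN (SELECT tmp.Id
                 FROM (SELECT MIN(Id) AS Id
                       FROM PersonDemo
                       GROUP BY Email) AS tmp);

-- Only the rows with Id 1 and 2 should remain.
SELECT * FROM PersonDemo ORDER BY Id;

-- Sanity check: returns no rows when every email is unique.
SELECT Email, COUNT(*) AS cnt
FROM PersonDemo
GROUP BY Email
HAVING COUNT(*) > 1;
```

The last query is a handy check after any of the DELETE variants in this post: an empty result means no duplicate emails are left.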
Other solutions
1. The official solution
Join the table with itself:
SELECT *
FROM Person2 p1,
Person2 p2
WHERE
p1.Email = p2.Email;
The rows we want to delete are the ones where p1.Id > p2.Id:
SELECT *
FROM Person2 p1,
Person2 p2
WHERE
p1.Email = p2.Email
AND p1.`Id` > p2.`Id`;
So the final answer is:
DELETE p1
FROM Person2 p1,
Person2 p2
WHERE
p1.Email = p2.Email
AND p1.`Id` > p2.`Id`;
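It is worth checking that the p1.Id > p2.Id condition still does the right thing when an email appears more than twice. The sketch below assumes MySQL and uses a made-up table PersonJoinDemo with made-up addresses (neither is from the original post). Every row except the smallest Id in its group has at least one partner with a smaller Id, so it shows up as p1 and gets deleted.

```sql
-- Hypothetical demo table and data (not from the original post).
DROP TABLE IF EXISTS PersonJoinDemo;
CREATE TABLE PersonJoinDemo (
    Id INT PRIMARY KEY,
    Email VARCHAR(50)
);
INSERT INTO PersonJoinDemo VALUES
    (1, 'a@example.com'),
    (2, 'a@example.com'),
    (3, 'a@example.com'),
    (4, 'b@example.com');

-- Preview the matching pairs: (2,1), (3,1), (3,2).
-- Id 1 and Id 4 never appear as p1, so they are kept.
SELECT p1.Id AS delete_id, p2.Id AS smaller_id
FROM PersonJoinDemo p1,
     PersonJoinDemo p2
WHERE p1.Email = p2.Email
  AND p1.Id > p2.Id
ORDER BY p1.Id, p2.Id;

-- Same join, now deleting the p1 side.
DELETE p1
FROM PersonJoinDemo p1,
     PersonJoinDemo p2
WHERE p1.Email = p2.Email
  AND p1.Id > p2.Id;

-- Rows with Id 1 and 4 remain.
SELECT * FROM PersonJoinDemo ORDER BY Id;
```

Row 3 matches two partners in the join, but it is still only deleted once.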
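Note
If you use table aliases, the alias has to appear after DELETE.
The official DELETE documentation shows this usage, for example in the following DELETE statement: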
DELETE t1
FROM t1
LEFT JOIN t2
ON t1.id=t2.id
WHERE t2.id IS NULL;
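This form of DELETE looked unfamiliar to me, but it turns out to be written much like a SELECT. It involves two tables, t1 and t2. DELETE t1 means that rows will be deleted from t1, and which rows is decided by the WHERE condition: a row is deleted if it satisfies the condition.
Here, the rows deleted are the ones in t1 that have no match in t2.
So in the official SQL, DELETE p1 means: delete from p1 the rows that satisfy the WHERE condition.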
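The behaviour of the LEFT JOIN form can be tried out on two tiny tables. This is only a sketch, assuming MySQL. The tables demo_t1 and demo_t2 and their contents are hypothetical stand-ins for the t1 and t2 in the documentation example.

```sql
-- Hypothetical demo tables (not from the original post).
DROP TABLE IF EXISTS demo_t1;
DROP TABLE IF EXISTS demo_t2;
CREATE TABLE demo_t1 (id INT PRIMARY KEY);
CREATE TABLE demo_t2 (id INT PRIMARY KEY);
INSERT INTO demo_t1 VALUES (1), (2), (3);
INSERT INTO demo_t2 VALUES (2);

-- Preview: rows of demo_t1 with no match in demo_t2 (ids 1 and 3).
SELECT t1.id
FROM demo_t1 AS t1
LEFT JOIN demo_t2 AS t2
    ON t1.id = t2.id
WHERE t2.id IS NULL;

-- Same join and filter, but deleting from the t1 side only.
-- Because the table is aliased, the alias t1 is what follows DELETE.
DELETE t1
FROM demo_t1 AS t1
LEFT JOIN demo_t2 AS t2
    ON t1.id = t2.id
WHERE t2.id IS NULL;

-- Only id 2 remains in demo_t1; demo_t2 is untouched.
SELECT * FROM demo_t1;
SELECT * FROM demo_t2;
```

Writing the SELECT first and then swapping in DELETE with the alias is a convenient way to confirm which rows a multi-table DELETE will remove.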
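2. From a set perspective
Let A be the set of rows found by grouping by email and taking the row with the smallest id in each group. To keep only A out of the universal set U, we need to delete (U - A). To compute this set difference, use a LEFT JOIN.
This is the situation shown in the figure.
Note that the statement below targets the problem's Person table rather than the Person2 test table created above.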
DELETE U
FROM Person AS U
LEFT JOIN (
SELECT MIN(id) AS `id`,email
FROM Person
GROUP BY email
) AS A ON (U.email = A.email AND U.id = A.id)
WHERE A.id IS NULL;
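Before deleting, the set difference U - A can be inspected with a SELECT that uses the same join. The sketch below assumes MySQL. The table PersonSetDemo and its addresses are made up for illustration and are not from the original post.

```sql
-- Hypothetical demo table and data (not from the original post).
DROP TABLE IF EXISTS PersonSetDemo;
CREATE TABLE PersonSetDemo (
    id INT PRIMARY KEY,
    email VARCHAR(50)
);
INSERT INTO PersonSetDemo VALUES
    (1, 'a@example.com'),
    (2, 'b@example.com'),
    (3, 'a@example.com');

-- A: the row with the smallest id for each email, here (1, a) and (2, b).
SELECT MIN(id) AS id, email
FROM PersonSetDemo
GROUP BY email;

-- U - A: rows of U that find no partner in A.
-- With the data above this is the single row (3, 'a@example.com').
SELECT U.*
FROM PersonSetDemo AS U
LEFT JOIN (
    SELECT MIN(id) AS id, email
    FROM PersonSetDemo
    GROUP BY email
) AS A ON (U.email = A.email AND U.id = A.id)
WHERE A.id IS NULL;
```

Replacing SELECT U.* with DELETE U turns this preview into the DELETE statement shown above.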
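3. Similar to the official solution
The only difference is that the second table in the join is the set of rows with the smallest id in each email group.
DELETE from the original table the rows that are not in that second table.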
DELETE P1
FROM Person AS P1,
(
SELECT MIN(P.id) AS `Id`,P.Email
FROM Person AS P
GROUP BY P.email
) AS P2
WHERE P1.Id != P2.Id AND P1.Email = P2.Email;
The same statement can also be written with an explicit JOIN:
DELETE P1
FROM Person AS P1
JOIN
(
SELECT MIN(P.id) AS `Id`,P.Email
FROM Person AS P
GROUP BY P.email
) AS P2
ON (P1.Id != P2.Id AND P1.Email = P2.Email);