MySQL重复数据删除处理

文章目录

    • 一 背景
    • 二 准备
    • 三 处理
        • 1 删除重复数据,只保留一条数据
            • 1.1 保留id最大的那一条
            • 1.2 保留id最小的那一条
        • 2 删除重复数据,只删除一条数据
            • 2.1 删除id最大的那一条
            • 2.2 删除id最小的那一条
    • 四 拓展

一 背景

​ 在实际业务开发过程中,经常会出现需要删除脏数据的情况,其中由于没有严格控制并发导致的重复数据删除尤为典型。处理脏数据的方式一般有两种,一是通过sql处理,二是通过代码处理。对于复杂的脏数据,代码处理是最推荐的,通过代码的逻辑,可以准确地控制不同情况,然后针对性测试。但很多时候脏数据处理是一次性的,如果开发投入精力过多,明显是一件性价比不高的事情。所以大多数情况下,开发会偏向于通过sql方式来修复数据,只需要写一个sql到线上环境跑一下(前提是可以写得出来,哈哈),都不需要上线代码,就可以解决根本性问题。本次针对重复性脏数据情况,根据不同需求场景,来通过sql方式处理脏数据。

二 准备

准备一张热腾腾的数据表:

CREATE TABLE `user_group` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `uid` int(11) NOT NULL,
  `gid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

插入数据:

insert  into `user_group`(`id`,`uid`,`gid`) values 
(1,1,1),
(2,1,1),
(3,1,2),
(4,1,3),
(5,2,1),
(6,2,2),
(7,2,2),
(8,2,2);

假定通过uid和gid联合作为唯一判断,那么重复的数据有:[1,2],[6,7,8]

三 处理

1 删除重复数据,只保留一条数据

1.1 保留id最大的那一条

删除的数据应为[1,6,7]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id < b.max_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id < b.max_id;
1.2 保留id最小的那一条

删除的数据应为[2,7,8]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id > b.min_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id > b.min_id;

2 删除重复数据,只删除一条数据

2.1 删除id最大的那一条

删除的数据应为[2,8]

#### QUERY
 SELECT a.* FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.max_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MAX(id) AS max_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.max_id;
2.2 删除id最小的那一条

删除的数据应为[1,6]

#### QUERY
SELECT a.* FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.min_id;
 
#### DELETE
DELETE a FROM  user_group AS a,
(
SELECT *,MIN(id) AS min_id FROM user_group GROUP BY uid,gid HAVING COUNT(1) > 1
) AS b
 WHERE a.uid = b.uid AND a.gid = b.gid AND a.id = b.min_id;

四 拓展

https://blog.csdn.net/yueliangdao0608/article/details/2306771

你可能感兴趣的:(MySQL,mysql)