Finding and Removing Duplicate Records in MySQL

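During project development, repeated runs of test data left a large number of duplicate rows in the table. These rows need to be deleted: otherwise a query that should return exactly one row returns several, and MyBatis throws an error.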

For example, take the following table structure:

CREATE TABLE `sys_user_auth` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
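  `user_id` varchar(20) DEFAULT NULL COMMENT 'User ID',
  `type` tinyint(2) DEFAULT NULL COMMENT '1. Queryable warehouse 2. Operable warehouse 3. Inventory-queryable warehouse 4. Brand 5. Category 6. Payroll-operable warehouse 7. HR-operable warehouse',
  `auth` varchar(45) DEFAULT NULL COMMENT 'Permission code',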
  `remarks` varchar(45) DEFAULT NULL,
  `del_flag` char(1) DEFAULT NULL,
  `create_by` varchar(20) DEFAULT NULL,
  `create_date` timestamp NULL DEFAULT NULL,
  `update_by` varchar(20) DEFAULT NULL,
  `update_date` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2959050 DEFAULT CHARSET=utf8mb4 COMMENT='User permission control table';

1. Find redundant duplicate records in the table, where duplicates are determined by a single field (auth)

SELECT
    *
FROM
    `sys_user_auth`

WHERE
    auth IN (
        SELECT
            auth
        FROM          
             `sys_user_auth`
        GROUP BY
            auth
        HAVING
            count(auth) > 1
    )
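
Before deleting anything, it can help to see how many copies each value has. A minimal sketch against the same `sys_user_auth` table (the `cnt` and `min_id` aliases are names chosen here for illustration):

```sql
-- One row per duplicated auth value, with its copy count and the id that would be kept.
SELECT
    auth,
    COUNT(*) AS cnt,
    MIN(id)  AS min_id
FROM
    `sys_user_auth`
GROUP BY
    auth
HAVING
    COUNT(*) > 1
ORDER BY
    cnt DESC;
```

Rows whose `auth` is NULL can form a group here, but they are never matched by the `IN` query above, because `NULL IN (...)` does not evaluate to true.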

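2. Delete redundant duplicate records in the table, where duplicates are determined by a single field (auth), keeping only the record with the smallest id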

Incorrect:

DELETE
FROM
    `sys_user_auth`
WHERE
    auth IN (

        SELECT
          a.auth
        FROM
            `sys_user_auth` a 
        GROUP BY
		  a.auth
        HAVING
            count(*) > 1
    )
AND id NOT IN (
    SELECT
        min(b.id) as id 
    FROM
          `sys_user_auth` b
    GROUP BY
      b.auth
    HAVING
        count(*) > 1
)

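Running this SQL fails with:

[Err] 1093 - You can't specify target table 'sys_user_auth' for update in FROM clause

In other words, MySQL does not allow a statement to modify a table while a subquery in the same statement selects from it.

The fix is to wrap each subquery in one more SELECT, so that the inner result is first materialized as a derived table: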

DELETE
FROM
    `sys_user_auth`
WHERE
    auth IN (
        SELECT c.auth FROM
        (
            SELECT
                a.auth
            FROM
                `sys_user_auth` a
            GROUP BY
                a.auth
            HAVING
                count(*) > 1
        ) c  -- lowercase, to match c.auth (table aliases can be case-sensitive)
    )
AND id NOT IN (
    SELECT d.id FROM
    (
        SELECT
            min(b.id) AS id
        FROM
            `sys_user_auth` b
        GROUP BY
            b.auth
        HAVING
            count(*) > 1
    ) d
)
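
An alternative that avoids error 1093 without the extra SELECT layer is MySQL's multi-table DELETE with a self-join. This is a sketch of a different technique, not the method used above; `t1` and `t2` are aliases chosen here:

```sql
-- Delete every row for which another row with the same auth and a smaller id exists,
-- so only the smallest id per auth value survives.
DELETE t1
FROM
    `sys_user_auth` t1
    JOIN `sys_user_auth` t2
        ON t1.auth = t2.auth
       AND t1.id > t2.id;
```

As with the `IN` version, rows whose `auth` is NULL are left untouched, since `NULL = NULL` is not true. Running the statement inside a transaction, or on a copy of the table first, makes it easy to check the result before committing.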

3. Find redundant duplicate records in the table (multiple fields)

SELECT * FROM `sys_user_auth`
WHERE
    (user_id, type, auth) IN (
        SELECT
            a.user_id,
            a.type,
            a.auth
        FROM
            `sys_user_auth` a
        GROUP BY
            a.user_id,
            a.type,
            a.auth
        HAVING
            count(*) > 1
    )
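
On MySQL 8.0 or later, window functions offer another way to list only the surplus copies, that is, every duplicate except the one with the smallest id. A sketch under that version assumption (`ranked` and `rn` are names chosen here):

```sql
-- Number the rows inside each (user_id, type, auth) group by ascending id;
-- rn = 1 is the row to keep, rn > 1 are the redundant copies.
SELECT
    id, user_id, type, auth
FROM (
    SELECT
        id, user_id, type, auth,
        ROW_NUMBER() OVER (
            PARTITION BY user_id, type, auth
            ORDER BY id
        ) AS rn
    FROM
        `sys_user_auth`
) ranked
WHERE
    rn > 1;
```

Unlike the `IN` comparison, `PARTITION BY` places NULL values in the same group, so rows with NULL in any of the three columns can show up here but not in the query above.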

4. Delete redundant duplicate records in the table (multiple fields), keeping only the record with the smallest id

DELETE
FROM
    `sys_user_auth`
WHERE
    (user_id, type, auth) IN (
        SELECT c.user_id, c.type, c.auth FROM
        (
            SELECT
                a.user_id,
                a.type,
                a.auth
            FROM
                `sys_user_auth` a
            GROUP BY
                a.user_id,
                a.type,
                a.auth
            HAVING
                count(*) > 1
        ) c
    )
AND id NOT IN (
    SELECT d.id FROM
    (
        SELECT
            min(b.id) AS id
        FROM
            `sys_user_auth` b
        GROUP BY
            b.user_id,
            b.type,
            b.auth
        HAVING
            count(*) > 1
    ) d
)
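
After the delete, a quick check can confirm that no duplicate combinations remain. A minimal sketch (the `cnt` alias is chosen here); an empty result means every remaining (user_id, type, auth) combination is unique:

```sql
-- Any row returned here is a combination that still occurs more than once.
SELECT
    user_id, type, auth,
    COUNT(*) AS cnt
FROM
    `sys_user_auth`
GROUP BY
    user_id, type, auth
HAVING
    COUNT(*) > 1;
```

Groups containing NULL in any of the three columns may still appear, because the `IN` comparison in the DELETE never matches them.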

6. Remove the first character on the left of a field:

UPDATE tableName
SET Title = RIGHT(Title, CHAR_LENGTH(Title) - 1)  -- CHAR_LENGTH counts characters; LENGTH would count bytes
WHERE
    Title LIKE '村%'

7. Remove the last character on the right of a field:

UPDATE tableName
SET Title = LEFT(Title, CHAR_LENGTH(Title) - 1)  -- CHAR_LENGTH counts characters; LENGTH would count bytes
WHERE
    Title LIKE '%村'
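
Before running either UPDATE, the trimming expressions can be previewed on literal strings. A small sketch using MySQL's `RIGHT`, `LEFT`, and `CHAR_LENGTH`, assuming a utf8mb4 connection character set so that '村' is treated as one character (the sample strings and column aliases are chosen here):

```sql
SELECT
    RIGHT('村庄', CHAR_LENGTH('村庄') - 1) AS without_first,  -- drops the leading '村'
    LEFT('新村', CHAR_LENGTH('新村') - 1)  AS without_last;   -- drops the trailing '村'
```

`CHAR_LENGTH` counts characters, whereas `LENGTH` counts bytes. With multi-byte text such as Chinese, `LENGTH(Title) - 1` is at least the number of characters, so the expression would leave such strings unchanged.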

8. Soft-delete redundant duplicate records in the table (multiple fields), excluding the record with the smallest rowid

UPDATE vitae
SET ispass = -1
WHERE
    peopleId IN (
        SELECT
            peopleId
        FROM
            vitae
        GROUP BY
            peopleId
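
The same soft-delete idea can be written for MySQL as a join against a derived table, which also sidesteps error 1093. This is only a sketch: MySQL has no `rowid`, so it assumes `vitae` has a unique `id` column (hypothetical, not shown in the source), and `dup` and `keep_id` are names chosen here:

```sql
-- Flag every duplicate row with ispass = -1, except the smallest id per peopleId.
UPDATE vitae v
JOIN (
    SELECT
        peopleId,
        MIN(id) AS keep_id   -- assumed id column; the row that stays unflagged
    FROM
        vitae
    GROUP BY
        peopleId
    HAVING
        COUNT(*) > 1
) dup
    ON v.peopleId = dup.peopleId
SET
    v.ispass = -1
WHERE
    v.id <> dup.keep_id;
```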
