sql-- 找到重复数据并删除、有重复数据不插入或更新的处理方法

表结构:

sql-- 找到重复数据并删除、有重复数据不插入或更新的处理方法_第1张图片

需求:找到相同的内容并删除

方法1:

通过分组找出筛选出count大于1的数据

select r_content, r_userid, count(*) from reviews GROUP BY r_content, r_userid having count(*) > 1

 group by可以利用聚合函数(count、 sum、 avg)进行分组

having 对分组的数据进行下一步筛选

 

 通过inner join找出相同数据

select *
from reviews a
         INNER JOIN
     (select r_content, r_userid, count(*) from reviews GROUP BY r_content, r_userid having count(*) > 1) b
     on a.r_content = b.r_content and a.r_userid = b.r_userid;

加行号

select a.*,
       if(@tmp = CONCAT(r_content, r_userid), @rownum := @rownum + 1, @rownum := 1) as rownum,
       @tmp := CONCAT(r_content, r_userid)
from (
         select a.*
         from reviews a
                  INNER JOIN
              (select r_content, r_userid, count(*) from reviews GROUP BY r_content, r_userid having count(*) > 1) b
              on a.r_content = b.r_content and a.r_userid = b.r_userid) a,
     (select @rownum := 0, @tmp := '') b

取出行号大于的删除之

delete from reviews where r_id in(
select r_id from (
select a.*,
       if(@tmp = CONCAT(r_content, r_userid), @rownum := @rownum + 1, @rownum := 1) as rownum,
       @tmp := CONCAT(r_content, r_userid)
from (
         select a.*
         from reviews a
                  INNER JOIN
              (select r_content, r_userid, count(*) from reviews GROUP BY r_content, r_userid having count(*) > 1) b
              on a.r_content = b.r_content and a.r_userid = b.r_userid) a,
     (select @rownum := 0, @tmp := '') b) a where rownum>1);

方法2:

通过group_concat来组合id,并删除

select group_concat(ids separator '|') as ids from(
select group_concat(r_id) as ids ,r_content,r_userid,count(*) from  reviews  group by r_content,r_userid having count(*)>1) a

 

有重复数据不插入或更新的处理方法

表设计:

需求:如果有相同内容则不插入,记录插入相同数据的条数

insert  into news (news_title, news_abstract,news_code) values ('这是一条新闻','新闻内容是XXX',md5(concat('这是一条新闻','新闻内容是XXX'))) on duplicate key update dumpnum=dumpnum+1;

转载于:https://www.cnblogs.com/8013-cmf/p/11168489.html

你可能感兴趣的:(sql-- 找到重复数据并删除、有重复数据不插入或更新的处理方法)