SQL Deduplication

I. Removing duplicates when cleaning a database

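When cleaning data in a database you end up using DELETE statements, and very often the job is to delete duplicate records while keeping exactly one copy of each. Several of the statements I found through Baidu raised errors, until I came across the one below, which works in real use without errors and also runs fast: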

DELETE consum_record
FROM
  consum_record,
  (
    SELECT
      MIN(id) AS id,
      user_id,
      monetary,
      consume_time
    FROM
      consum_record
    GROUP BY
      user_id,
      monetary,
      consume_time
    HAVING
      COUNT(*) > 1
  ) t2
WHERE
  consum_record.user_id = t2.user_id
  AND consum_record.monetary = t2.monetary
  AND consum_record.consume_time = t2.consume_time
  AND consum_record.id > t2.id;

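1. The subquery (SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1) t2 turns the duplicated data into a temporary (derived) table. It has one row per group of duplicates, and that row holds the smallest id in the group.

2. Join the original table to t2 on user_id, monetary and consume_time, then delete every row of the original table whose id is greater than the id in t2. This removes the duplicates and keeps one record per group: the one with the smallest id.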
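The two steps above can be tried out on a throwaway table. What follows is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, the sample rows and the helper names (build_sample_db, duplicate_groups, dedupe_keep_min_id, remaining_ids) are made up for illustration, and because SQLite does not accept the multi-table DELETE form used above, the delete is rewritten as "remove every row whose id is not the smallest id of its group". That is a different statement with the same keep-the-minimum-id result, not the original one.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory consum_record table with made-up rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE consum_record ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, monetary REAL, consume_time TEXT)"
    )
    rows = [
        (1, 10, 9.5, "2020-01-01 10:00:00"),
        (2, 10, 9.5, "2020-01-01 10:00:00"),  # duplicate of id 1
        (3, 10, 9.5, "2020-01-01 10:00:00"),  # duplicate of id 1
        (4, 11, 20.0, "2020-01-02 12:00:00"),
        (5, 12, 5.0, "2020-01-03 08:30:00"),
        (6, 12, 5.0, "2020-01-03 08:30:00"),  # duplicate of id 5
    ]
    conn.executemany("INSERT INTO consum_record VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def duplicate_groups(conn):
    """Step 1: same query as the t2 subquery, one row per duplicated group."""
    return conn.execute(
        "SELECT MIN(id), user_id, monetary, consume_time "
        "FROM consum_record "
        "GROUP BY user_id, monetary, consume_time "
        "HAVING COUNT(*) > 1 "
        "ORDER BY MIN(id)"
    ).fetchall()


def dedupe_keep_min_id(conn):
    """Step 2 (SQLite-friendly rewrite): keep only the smallest id of each group."""
    cur = conn.execute(
        "DELETE FROM consum_record WHERE id NOT IN ("
        "  SELECT MIN(id) FROM consum_record"
        "  GROUP BY user_id, monetary, consume_time)"
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted


def remaining_ids(conn):
    """Return the ids still present in the table, in ascending order."""
    return [r[0] for r in conn.execute("SELECT id FROM consum_record ORDER BY id")]


if __name__ == "__main__":
    conn = build_sample_db()
    print("duplicated groups:", duplicate_groups(conn))
    print("rows deleted:", dedupe_keep_min_id(conn))
    print("ids kept:", remaining_ids(conn))
```

Running duplicate_groups before the delete is a cheap way to check what is about to be removed. On a real table it is safer to take a backup, or to run the delete inside a transaction, before trying it.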

 

II. Removing duplicates in queries

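There are two ways to remove duplicates in a query: DISTINCT and GROUP BY. DISTINCT is written in the SELECT clause. GROUP BY is used more often; it exists for aggregate statistics, but it can also do deduplication. It can be slower than DISTINCT, although how much depends on the database and the query plan.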
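To make the comparison concrete, here is a small sketch that runs both forms against the same made-up data and shows that they return the same rows. It assumes Python's built-in sqlite3 module as a stand-in database; the sample rows and the helper names (make_query_demo_db, select_distinct, select_group_by, select_group_by_counts) are hypothetical. It says nothing about relative speed, which has to be measured on the real database.

```python
import sqlite3


def make_query_demo_db():
    """Create an in-memory consum_record table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE consum_record ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, monetary REAL, consume_time TEXT)"
    )
    rows = [
        (1, 10, 9.5, "2020-01-01 10:00:00"),
        (2, 10, 9.5, "2020-01-01 10:00:00"),  # same user_id and monetary as id 1
        (3, 11, 20.0, "2020-01-02 12:00:00"),
        (4, 10, 3.0, "2020-01-04 09:00:00"),
    ]
    conn.executemany("INSERT INTO consum_record VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def select_distinct(conn):
    """Deduplicate with DISTINCT in the SELECT clause."""
    return conn.execute(
        "SELECT DISTINCT user_id, monetary FROM consum_record "
        "ORDER BY user_id, monetary"
    ).fetchall()


def select_group_by(conn):
    """Deduplicate with GROUP BY on the same columns."""
    return conn.execute(
        "SELECT user_id, monetary FROM consum_record "
        "GROUP BY user_id, monetary "
        "ORDER BY user_id, monetary"
    ).fetchall()


def select_group_by_counts(conn):
    """What GROUP BY adds over DISTINCT: an aggregate per group."""
    return conn.execute(
        "SELECT user_id, monetary, COUNT(*) FROM consum_record "
        "GROUP BY user_id, monetary "
        "ORDER BY user_id, monetary"
    ).fetchall()


if __name__ == "__main__":
    conn = make_query_demo_db()
    print(select_distinct(conn))
    print(select_group_by(conn))
    print(select_group_by_counts(conn))
```

If only the unique rows are needed, DISTINCT states that intent directly. GROUP BY is the one to reach for when a count or another aggregate per group is wanted as well.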
