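The primary key is a snowflake ID. The smallest ID is 1630961122999455744 and the largest is 1631593704371969722.
The test table holds 6 million rows, and each batch deletes only 5,000 rows.
Approach 1: restrict the primary-key range, then use LIMIT to cap how many rows each DELETE removes.
Drawback: when the primary-key range is very large, each batch gets slower and slower as the run progresses.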
// java
public void range() throws InterruptedException {
    Long min = 1630961122999455744L;
    Long max = 1631593704371969722L;
    int limit = 5000;
    int num = limit;
    int total = 0;
    int counter = 0;
    long sleepTotal = 0;
    log.info("Starting delete by primary-key range...");
    long s = System.currentTimeMillis();
    // A batch smaller than limit means the range has been drained.
    while (num == limit) {
        counter++;
        long startMill = System.currentTimeMillis();
        num = mapper.delByRange(min, max, limit);
        long endMill = System.currentTimeMillis();
        long sleep = RandomUtil.randomLong(100, 200);
        total = total + num;
        // Log roughly 20% of the batches to keep log volume down.
        if (ThreadLocalRandom.current().nextInt(100) < 20) {
            log.info("[sampled] delete by primary-key range: iteration {}, deleted {} this batch, {} so far, took {} ms, sleeping {} ms",
                    counter, num, total, (endMill - startMill), sleep);
        }
        // Pause between batches so the database is not hit continuously.
        Thread.sleep(sleep);
        sleepTotal = sleepTotal + sleep;
    }
    long e = System.currentTimeMillis();
    log.info("Delete by primary-key range finished: iterations {}, rows deleted {}, total sleep {} ms, total time {} ms",
            counter, total, sleepTotal, (e - s));
}
DELETE FROM `table`
WHERE `id` BETWEEN 1630961122999455744 AND 1631593704371969722
LIMIT 5000
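Approach 2: start at the smallest primary key and advance the window by 5,000 each time, i.e. 1630961122999455744 + 5000, then keep adding 5,000 and deleting in a loop.
Drawback: when the primary keys are not contiguous, most windows contain no rows and the DELETE removes nothing. If the loop does not sleep, the flood of empty DELETE statements drives MySQL CPU usage up sharply. If it does sleep, the sheer number of empty deletes makes the total sleep time far too long, which drags down overall deletion throughput.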
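To make the scale of the problem concrete, the following is a minimal sketch that counts how many fixed-width windows the +5,000 stepping has to visit for the ID range and row count quoted above. The class and method names are hypothetical, and the 10 ms figure is the shortest empty-delete sleep used in the code of this approach. Since each row can fall in at most one window, any window beyond the row count must be empty.

```java
// Sketch: estimate the number of fixed-width windows and the minimum empty ones.
class EmptyWindowEstimate {
    /** Number of windows of the given width needed to cover the inclusive range [min, max]. */
    static long windowCount(long min, long max, long width) {
        long span = max - min + 1;          // number of candidate IDs in the range
        return (span + width - 1) / width;  // ceiling division
    }

    /** Lower bound on empty windows: each row occupies at most one window. */
    static long minEmptyWindows(long windows, long rows) {
        return Math.max(0, windows - rows);
    }

    public static void main(String[] args) {
        long min = 1630961122999455744L;
        long max = 1631593704371969722L;
        long windows = windowCount(min, max, 5_000L);
        long empty = minEmptyWindows(windows, 6_000_000L);
        // Assumes every empty window costs at least a 10 ms sleep.
        long sleepDays = empty * 10L / 1000L / 86_400L;
        System.out.printf("windows=%d, empty>=%d, minimum sleep=%d days%n",
                windows, empty, sleepDays);
    }
}
```

The range spans on the order of 10^11 windows for only about 6 million rows, so nearly every DELETE is an empty one regardless of how short the sleep is.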
// java
public void add() throws InterruptedException {
    int limit = 5000;
    Long maxId = 1631593704371969722L;
    Long min = 1630961122999455744L;
    Long max = min + limit;
    int num = 1;
    int total = 0;
    int counter = 0;
    int emptyCounter = 0;
    long sleepTotal = 0;
    long emptySleepTotal = 0;
    log.info("Starting delete from the smallest primary key, +5k per step...");
    long s = System.currentTimeMillis();
    while (num > 0) {
        counter++;
        long startMill = System.currentTimeMillis();
        // routing is not defined in this snippet; it is presumably a field of the enclosing class.
        num = mapper.delByIncreament(routing, min, max, limit);
        long endMill = System.currentTimeMillis();
        // Sleep only briefly after an empty delete, longer after a real one.
        long sleep = num == 0 ? RandomUtil.randomLong(10, 20) : RandomUtil.randomLong(100, 200);
        if (num == 0) {
            emptyCounter++;
            emptySleepTotal = emptySleepTotal + sleep;
        }
        total = total + num;
        // Log roughly 20% of the batches to keep log volume down.
        if (ThreadLocalRandom.current().nextInt(100) < 20) {
            log.info("[sampled] delete by +5k stepping: deleted {} this batch, {} so far, took {} ms, sleeping {} ms",
                    num, total, (endMill - startMill), sleep);
        }
        // An empty delete must not end the loop until the window has reached the largest ID.
        if (max.compareTo(maxId) < 0) {
            num = 1;
        }
        Thread.sleep(sleep);
        // Move the window forward and clamp it to the largest ID.
        min = max + 1L;
        max = min + limit;
        if (max.compareTo(maxId) > 0) {
            max = maxId;
        }
        sleepTotal = sleepTotal + sleep;
    }
    long e = System.currentTimeMillis();
    log.info("Delete by +5k stepping finished: iterations {}, empty deletes {}, rows deleted {}, total sleep {} ms, sleep after empty deletes {} ms, total time {} ms",
            counter, emptyCounter, total, sleepTotal, emptySleepTotal, (e - s));
}
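Approach 3: starting from the smallest primary key, look up the 5,000th and 5,001st IDs counting from the start ID. The 5,000th ID is the upper bound of the current delete, and the 5,001st ID becomes the start of the next batch. From that 5,001st ID, look up the next 5,000th and 5,001st IDs again, and repeat until all rows are deleted.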
public void offset() throws InterruptedException {
    int limit = 5000;
    int num = limit;
    Long minId = 1630961122999455744L;
    Long maxId = 1631593704371969722L;
    int total = 0;
    int counter = 0;
    long sleepTotal = 0;
    long delTotal = 0;
    long scanTotal = 0;
    log.info("Starting delete by primary-key offset cursor...");
    Long startId = minId;
    // scanId returns at most two IDs: the end of this batch and the start of the next one.
    List<Long> ids = mapper.scanId(startId, limit);
    Long endId = ids.size() == 0 ? maxId : ids.get(0);
    Long nextId = ids.size() > 1 ? ids.get(1) : null;
    long s = System.currentTimeMillis();
    while (num == limit) {
        counter++;
        long startMill = System.currentTimeMillis();
        num = mapper.delByScan(startId, endId, limit);
        long endMill = System.currentTimeMillis();
        delTotal = delTotal + (endMill - startMill);
        total = total + num;
        if (null == nextId) {
            // No further start ID: the batch just deleted was the last one.
            break;
        } else {
            // Advance the cursor and look up the bounds of the next batch.
            startId = nextId;
            long startMill2 = System.currentTimeMillis();
            ids = mapper.scanId(startId, limit);
            long endMill2 = System.currentTimeMillis();
            scanTotal = scanTotal + (endMill2 - startMill2);
            endId = ids.size() == 0 ? maxId : ids.get(0);
            nextId = ids.size() > 1 ? ids.get(1) : null;
        }
        long sleep = RandomUtil.randomLong(100, 200);
        // Log roughly 20% of the batches to keep log volume down.
        if (ThreadLocalRandom.current().nextInt(100) < 20) {
            log.info("[sampled] delete by offset cursor: iteration {}, deleted {} this batch, {} so far, took {} ms, sleeping {} ms",
                    counter, num, total, (endMill - startMill), sleep);
        }
        Thread.sleep(sleep);
        sleepTotal = sleepTotal + sleep;
    }
    long e = System.currentTimeMillis();
    log.info("Delete by offset cursor finished: iterations {}, rows deleted {}, total sleep {} ms, total delete time {} ms, total scan time {} ms, total time {} ms",
            counter, total, sleepTotal, delTotal, scanTotal, (e - s));
}
SELECT `id`
FROM `table`
WHERE `id` > ${minId}
ORDER BY `id`
LIMIT ${limit - 2}, 2
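The cursor logic can be tried out without a database. The sketch below is a simplified in-memory stand-in, not the author's mapper: a `TreeSet` plays the role of the table, `scanId` mimics the SELECT above (skip `limit - 2` IDs greater than the start ID, return up to two), and `delByScan` is a hypothetical equivalent of a ranged delete with a LIMIT. It assumes `limit` is at least 2 and that the start ID exists in the table.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

// In-memory stand-in for the table: a sorted set of sparse IDs.
class OffsetCursorSketch {
    final TreeSet<Long> table = new TreeSet<>();

    // Mirrors the SELECT: among IDs > startId, skip (limit - 2) and return up to 2.
    List<Long> scanId(long startId, int limit) {
        List<Long> out = new ArrayList<>();
        int skipped = 0;
        for (Long id : table.tailSet(startId, false)) {
            if (skipped < limit - 2) {
                skipped++;
                continue;
            }
            out.add(id);
            if (out.size() == 2) {
                break;
            }
        }
        return out;
    }

    // Hypothetical ranged delete: remove up to `limit` IDs in [startId, endId].
    int delByScan(long startId, long endId, int limit) {
        int deleted = 0;
        Iterator<Long> it = table.subSet(startId, true, endId, true).iterator();
        while (it.hasNext() && deleted < limit) {
            it.next();
            it.remove();
            deleted++;
        }
        return deleted;
    }

    // Runs the cursor loop and returns the number of delete batches issued.
    int run(long minId, long maxId, int limit) {
        int batches = 0;
        Long startId = minId;
        while (startId != null) {
            List<Long> ids = scanId(startId, limit);
            long endId = ids.isEmpty() ? maxId : ids.get(0);   // end of this batch
            Long nextId = ids.size() > 1 ? ids.get(1) : null;  // start of the next batch
            delByScan(startId, endId, limit);
            batches++;
            startId = nextId;
        }
        return batches;
    }

    public static void main(String[] args) {
        OffsetCursorSketch sketch = new OffsetCursorSketch();
        for (long i = 0; i < 12; i++) {
            sketch.table.add(1000L + i * 37); // sparse, non-contiguous IDs
        }
        int batches = sketch.run(1000L, 1000L + 11 * 37, 5);
        System.out.println("batches=" + batches + ", remaining=" + sketch.table.size());
    }
}
```

Because every batch boundary comes from IDs that actually exist, the number of batches depends only on the row count, not on how sparse the ID space is.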
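| Approach | Time taken |
| --- | --- |
| Approach 1 | 38 minutes |
| Approach 2 | Because snowflake-ID primary keys do not increase contiguously, fewer than 20,000 rows were deleted in 20 minutes |
| Approach 3 | 5 minutes (sleeping 100 to 200 ms per batch) |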
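As a rough sanity check on these timings, the sleep alone can be budgeted. The sketch below assumes about 6 million rows, 5,000 rows per batch, and a sleep that averages the midpoint of the 100 to 200 ms range. The class and method names are hypothetical.

```java
// Sketch: how much of the total run time is spent sleeping between batches.
class SleepBudget {
    /** Number of delete batches needed for `rows` rows at `batchSize` rows per batch. */
    static long batches(long rows, long batchSize) {
        return (rows + batchSize - 1) / batchSize; // ceiling division
    }

    /** Expected total sleep in seconds for a sleep averaging the midpoint of [loMs, hiMs]. */
    static double expectedSleepSeconds(long batches, long loMs, long hiMs) {
        return batches * ((loMs + hiMs) / 2.0) / 1000.0;
    }

    public static void main(String[] args) {
        long n = batches(6_000_000L, 5_000L);
        double sleep = expectedSleepSeconds(n, 100, 200);
        System.out.printf("batches=%d, expected sleep=%.0f s%n", n, sleep);
    }
}
```

Under these assumptions the run needs about 1,200 batches and about 180 seconds of sleep, so around 3 of the 5 minutes reported for the third approach would be sleep rather than database work, while the 38 minutes of the first approach would be dominated by the DELETE statements themselves.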