主从模式部署的broker集群,主节点同步模式为:异步复制,刷盘方式为:异步刷盘。
在主节点宕机重启后,从节点的store.log里大量出现了如下日志,并且在不断的输出:
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
重启从节点后,依然无法恢复。
首先搜索该警告对应的代码,CommitLog.java
部分代码如下:
public DispatchRequest checkMessageAndReturnSize(java.nio.ByteBuffer byteBuffer, final boolean checkCRC, final boolean readBody) {
try {
// 1 TOTAL SIZE
int totalSize = byteBuffer.getInt();
// 2 MAGIC CODE
int magicCode = byteBuffer.getInt();
switch (magicCode) {
case MESSAGE_MAGIC_CODE:
break;
case BLANK_MAGIC_CODE:
return new DispatchRequest(0, true /* success */);
default:
log.warn("found a illegal magic code 0x" + Integer.toHexString(magicCode));
return new DispatchRequest(-1, false /* success */);
}
调用checkMessageAndReturnSize
的为ReputMessageService.java
的如下代码:
class ReputMessageService extends ServiceThread {
public void run() {
while (!this.isStopped()) {
try {
Thread.sleep(1);
this.doReput();
} catch (Exception e) {
... ...
}
}
}
private void doReput() {
for (boolean doNext = true; this.isCommitLogAvailable() && doNext; ) {
SelectMappedBufferResult result = commitLog.getData(reputFromOffset);
if (result != null) {
try {
this.reputFromOffset = result.getStartOffset();
for (int readSize = 0; readSize < result.getSize() && doNext; ) {
DispatchRequest dispatchRequest = commitLog.checkMessageAndReturnSize(result.getByteBuffer(), false, false);
int size = dispatchRequest.getBufferSize() == -1 ? dispatchRequest.getMsgSize() : dispatchRequest.getBufferSize();
if (dispatchRequest.isSuccess()) {
... ...
} else if (!dispatchRequest.isSuccess()) {
if (size > 0) {
log.error("[BUG]read total count not equals msg total size. reputFromOffset={}", reputFromOffset);
this.reputFromOffset += size;
} else {
doNext = false;
// If user open the dledger pattern or the broker is master node,
// it will not ignore the exception and fix the reputFromOffset variable
if (DefaultMessageStore.this.getMessageStoreConfig().isEnableDLegerCommitLog() ||
DefaultMessageStore.this.brokerConfig.getBrokerId() == MixAll.MASTER_ID) {
log.error("[BUG]dispatch message to consume queue error, COMMITLOG OFFSET: {}",
this.reputFromOffset);
this.reputFromOffset += result.getSize() - readSize;
}
}
}
}
} finally {
result.release();
}
} else {
doNext = false;
}
}
}
}
ReputMessageService.java
的作用是根据commitlog构建出consume queue和index文件,即参考如下broker存储的数据文件目录:
data
|-- commitlog
|-- config
|-- consumequeue
|-- index
由此得知,由于ReputMessageService
在循环执行构建任务时,调用checkMessageAndReturnSize
,返回的结果指示为不成功,导致doReput()
无限循环执行。
那么到底是哪个文件导致上面的无限循环呢?
经过走查当时主节点重启后的日志,发现如下关键日志:
2022-05-11 13:50:11 INFO main - recover from this mapped file /data/broker-h/data/commitlog/00000012532714569728
2022-05-11 13:50:11 INFO FlushIndexFileThread - flush index file elapsed time(ms) 117
2022-05-11 13:50:15 WARN main - found a illegal magic code 0x0
2022-05-11 13:50:15 INFO main - recover physics file end, /data/broker-h/data/commitlog/00000012532714569728 pos=995267206
即,broker在重启时,会根据最后一个commitlog恢复相应的consume queue,应该是防止commitlog损坏,进行修复,这里可以看到文件00000012532714569728出现问题。
commitlog数据到底哪里不一样?
首先主从的00000012532714569728文件md5值不一样,需要写代码详细分析,将主从节点的commitlog导出,编写如下代码程序进行测试:
public void testCommitLog() throws IOException {
int fileSize = 1024 * 1024 * 1024;
MappedFile mappedFile = new MappedFile("00000012532714569728", fileSize);
mappedFile.setWrotePosition(fileSize);
SelectMappedBufferResult result = mappedFile.selectMappedBuffer(0);
ByteBuffer byteBuffer = result.getByteBuffer();
int curSize = 0;
int msgNO = 0;
while (byteBuffer.hasRemaining()) {
int msgSize = byteBuffer.getInt();
int magicCode = byteBuffer.getInt();
++msgNO;
System.out.println("msgNO=" + msgNO + ",curSize=" + curSize + ",magicCode=" + magicCode + ",msgSize=" + curSize);
curSize += msgSize;
byteBuffer.position(curSize);
if (magicCode == 0) {
break;
}
}
}
主节点commitlog的测试关键输出如下,部分省略:
msgNO=692026,curSize=995261690,magicCode=-626843481,msgSize=1498
msgNO=692027,curSize=995263188,magicCode=-626843481,msgSize=2953
msgNO=692028,curSize=995266141,magicCode=-626843481,msgSize=1057
msgNO=692029,curSize=995267198,magicCode=0,msgSize=0
即,主节点的commitlog到msgNO=692029条消息后,magicCode和msgSize都为0,即从此之后没有数据了。
从节点commitlog的测试关键输出如下,部分省略:
msgNO=692026,curSize=995261690,magicCode=-626843481,msgSize=1498
msgNO=692027,curSize=995263188,magicCode=-626843481,msgSize=2953
msgNO=692028,curSize=995266141,magicCode=-626843481,msgSize=1057
msgNO=692029,curSize=995267198,magicCode=-626843481,msgSize=3190
msgNO=692030,curSize=995270388,magicCode=-626843481,msgSize=1691
msgNO=692031,curSize=995272079,magicCode=-626843481,msgSize=1627
... ...
msgNO=692084,curSize=995363023,magicCode=-626843481,msgSize=655
msgNO=692085,curSize=995363678,magicCode=-626843481,msgSize=918
msgNO=692086,curSize=995364596,magicCode=-626843481,msgSize=2542
msgNO=692087,curSize=995367138,magicCode=-444028443,msgSize=-1930845766
java.lang.IllegalArgumentException
at java.nio.Buffer.position(Buffer.java:244)
at org.apache.rocketmq.store.MappedFileTest.testCommitLog(MappedFileTest.java:79)
从节点的commitlog到msgNO=692029条消息后,仍然是合法的数据,一直到msgNO=692087后,magicCode和msgSize出现负值。
从这里可以直接得出答案,由于从节点的commitlog存在非法数据,导致ReputMessageService
在构建相关索引文件时,出现无限循环的情况。
从节点的非法数据是如何写进去的?
先看下整个事件的顺序:
根据上面的顺序,问题出现的时间点是第3步,即主节点启动后,从节点从主节点同步数据。
调查主节点的日志,发现如下日志:
2022-05-11 14:00:50 INFO AcceptSocketService - HAService receive new connection, /10.x.x.x:57510
2022-05-11 14:00:50 INFO ReadSocketService - ReadSocketService service started
2022-05-11 14:00:50 INFO WriteSocketService - WriteSocketService service started
2022-05-11 14:00:55 INFO ReadSocketService - slave[/10.x.x.x:57510] request offset 12533709936866
2022-05-11 14:00:55 INFO WriteSocketService - master transfer data from 12533709936866 to slave[/10.x.x.x:57510], and slave request 12533709936866
即从节点请求offset:12533709936866的数据。
而对应从节点的日志,如下:
2022-05-11 14:00:49 INFO BrokerControllerScheduledThread1 - update master address, OLD: 10.x.x.x:10912 NEW: 10.x.x.x:10912
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
2022-05-11 14:00:55 WARN ReputMessageService - found a illegal magic code 0xe588a9e5
即,主节点在2022-05-11 14:00:55把数据传输至从节点后,从节点就发生了found a illegal magic code
。
查看一下主节点数据同步WriteSocketService.java
的部分代码:
SelectMappedBufferResult selectResult = haService.getDefaultMessageStore().getCommitLogData(this.nextTransferFromWhere);
int size = selectResult.getSize();
if (size > 1024 * 32) {
size = 1024 * 32;
}
selectResult.getByteBuffer().limit(size);
即,最大传输的数据量为32K,也就是说,主节点传输的数据并不能保障是完整的消息数据。
看一下从节点接收数据HAClient.java
的部分代码:
while (true) {
int diff = this.byteBufferRead.position() - this.dispatchPosition;
if (diff >= msgHeaderSize) {
if (diff >= (msgHeaderSize + bodySize)) {
byte[] bodyData = byteBufferRead.array();
int dataStart = this.dispatchPosition + msgHeaderSize;
HAService.this.defaultMessageStore.appendToCommitLog(masterPhyOffset, bodyData, dataStart, bodySize);
}
}
}
即,只要数据包大于等于头数据+数据体大小时,就写入commitlog,从节点并没有对数据进行完整性校验。
想象一下如下情况:
此时,如果主节点commitlog已经完全持久化了,那么当主节点重启后,从节点还有可能获取到完整的数据。
看下本案例中,主从节点的commitlog:00000012532714569728的数据量:
从节点commitlog的消息量比主节点多692086-692028=58条!
也就是说,主节点重启后,从节点将永远无法从主节点拉取到需要的数据,因为从节点的数据多于主节点,主节点的部分数据丢失了!
由于主节点采用了异步刷盘,默认每500ms内凑够4页才会刷一次,所以存在数据丢失的可能。
此问题的本质原因:主从同步过程中,由于数据不一致导致从节点无法正常同步数据。
此问题带来的影响:从节点无法构建consume queue,所以从节点的数据将无法查询,同时失去了主从高可用的意义。
解决方案:
由于从节点的commitlog数据跟主节点的不一致,而commitlog并不支持部分清除数据,故只能在主节点启动后,关闭从节点,删除其commitlog的最后一个数据文件,重启后,从主节点将自动拉取commitlog最后一个的数据文件。