When reposting, please credit the original: http://blog.csdn.net/duck_genuine/article/details/8440125
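SolrCloud collection management: a potentially serious bug
Issuing a delete for a collection that does not exist causes a serious problem: every subsequent collection command fails. The bug has been fixed in 4.1: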
https://issues.apache.org/jira/browse/SOLR-4008
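The OverseerCollectionProcessor class throws a runtime exception, which kills its thread, so none of the later commands are ever picked up.
Worst of all, the only way to recover from this exception is to clear the data in ZooKeeper, specifically the overseer collection queue.
The code change that fixes the bug is simple: catch the exception and return false as the result of the operation.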
@Override
public void run() {
  log.info("Process current queue of collection creations");
  while (amILeader() && !isClosed) {
    try {
      // Blocks until a message is available at the head of the queue.
      byte[] head = workQueue.peek(true);

      //if (head != null) { // should not happen since we block above
      final ZkNodeProps message = ZkNodeProps.load(head);
      final String operation = message.getStr(QUEUE_OPERATION);
      // An unchecked exception escaping processMessage is not caught below
      // and would end this thread.
      boolean success = processMessage(message, operation);
      if (!success) {
        // TODO: what to do on failure / partial failure
        // if we fail, do we clean up then ?
        SolrException.log(log, "Collection creation of " + message.getStr("name") + " failed");
      }
      //}
      // The message is removed whether or not processing succeeded.
      workQueue.remove();
    } catch (KeeperException e) {
      if (e.code() == KeeperException.Code.SESSIONEXPIRED
          || e.code() == KeeperException.Code.CONNECTIONLOSS) {
        log.warn("Overseer cannot talk to ZK");
        return;
      }
      SolrException.log(log, "", e);
      throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "",
          e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return;
    }
  }
}
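
The idea behind the fix (catch the exception inside message processing and report failure by returning false) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the actual Solr patch: the class `QueueWorkerSketch`, its in-memory set of collections, and the `String[]` messages are hypothetical stand-ins for OverseerCollectionProcessor, the cluster state in ZooKeeper, and ZkNodeProps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in for a queue-driven worker; not Solr code.
class QueueWorkerSketch {
    // Assumption: an in-memory set replaces the real cluster state.
    private final Set<String> collections = new HashSet<>();
    private final List<String> failures = new ArrayList<>();

    QueueWorkerSketch(String... existing) {
        collections.addAll(Arrays.asList(existing));
    }

    // Simulates the failure mode: deleting a missing collection throws
    // an unchecked exception.
    private void deleteCollection(String name) {
        if (!collections.remove(name)) {
            throw new IllegalStateException("no such collection: " + name);
        }
    }

    // The fix in miniature: unchecked exceptions are caught here and
    // turned into a false return value instead of escaping to the loop.
    boolean processMessage(String operation, String name) {
        try {
            if ("delete".equals(operation)) {
                deleteCollection(name);
            } else if ("create".equals(operation)) {
                collections.add(name);
            } else {
                throw new IllegalArgumentException("unknown operation: " + operation);
            }
            return true;
        } catch (RuntimeException e) {
            failures.add(operation + " " + name + ": " + e.getMessage());
            return false;
        }
    }

    // Processes every queued message; a failed message is recorded and
    // removed, so later messages are still handled.
    int drain(Queue<String[]> queue) {
        int processed = 0;
        while (!queue.isEmpty()) {
            String[] head = queue.peek();
            processMessage(head[0], head[1]);
            queue.remove();
            processed++;
        }
        return processed;
    }

    boolean hasCollection(String name) {
        return collections.contains(name);
    }

    List<String> failures() {
        return failures;
    }

    public static void main(String[] args) {
        QueueWorkerSketch worker = new QueueWorkerSketch("existing");
        Queue<String[]> queue = new ArrayDeque<>();
        queue.add(new String[] {"delete", "missing"}); // would have killed the thread
        queue.add(new String[] {"create", "next"});    // must still be processed
        System.out.println("processed: " + worker.drain(queue));
        System.out.println("failures: " + worker.failures());
    }
}
```

In this sketch the delete of a missing collection is logged as a failure and dropped from the queue, and the create that follows it still runs. Without the catch in `processMessage`, the exception would propagate out of `drain` before `queue.remove()` is reached, leaving the bad message at the head of the queue.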