1. Problem
Loading a Hive table in Kylin fails with the following error:
2018-01-25 15:55:47,581 TRACE [http-bio-7070-exec-5] hbase.HBaseResourceStore:311 : Update row /table_exd/NLOGS.BRO_DHCP.json from oldTs: 0, to newTs: 1516866947530, operation result: false
2018-01-25 15:55:47,583 ERROR [http-bio-7070-exec-5] controller.TableController:118 : Failed to load Hive Table
java.lang.IllegalStateException: Overwriting conflict /table_exd/NLOGS.BRO_DHCP.json, expect old TS 0, but it is 1516183299747
at org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:315)
at org.apache.kylin.common.persistence.ResourceStore.checkAndPutResourceCheckpoint(ResourceStore.java:294)
at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:280)
at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:260)
at org.apache.kylin.metadata.MetadataManager.saveTableExt(MetadataManager.java:241)
at org.apache.kylin.rest.service.TableService.loadHiveTablesToProject(TableService.java:166)
at org.apache.kylin.rest.service.TableService$$FastClassBySpringCGLIB$$4a7fb179.invoke()
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
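2. Root cause analysis
The core error is a timestamp (TS) conflict when the metadata is written to HBase. The likely cause is that HBase already holds metadata for this table, so we need to find out which HBase table stores it and what the row key is.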
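The conflict message already carries the three facts the rest of the analysis relies on: the resource path, the timestamp Kylin expected, and the timestamp actually stored. The following is a minimal sketch for pulling them out of a log line. The class name ConflictMessage and its methods are hypothetical helpers, not part of Kylin. Treating the TS values as epoch milliseconds is an assumption, although the newTs in the TRACE line (1516866947530) does fall on 2018-01-25 07:55:47 UTC, which matches the log time of 15:55:47 on a UTC+8 server.

```java
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kylin: parses the "Overwriting conflict" message.
final class ConflictMessage {
    // Mirrors the wording of the exception message shown in the log above.
    private static final Pattern PATTERN = Pattern.compile(
            "Overwriting conflict (\\S+), expect old TS (\\d+), but it is (\\d+)");

    final String resPath;   // resource path reported by Kylin
    final long expectedTs;  // timestamp Kylin expected to find (0 in this case)
    final long actualTs;    // timestamp actually stored

    private ConflictMessage(String resPath, long expectedTs, long actualTs) {
        this.resPath = resPath;
        this.expectedTs = expectedTs;
        this.actualTs = actualTs;
    }

    static ConflictMessage parse(String logLine) {
        Matcher m = PATTERN.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a conflict message: " + logLine);
        }
        return new ConflictMessage(m.group(1),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    // Assumption: the TS is epoch milliseconds.
    Instant actualWriteTime() {
        return Instant.ofEpochMilli(actualTs);
    }

    public static void main(String[] args) {
        ConflictMessage c = parse("java.lang.IllegalStateException: Overwriting conflict "
                + "/table_exd/NLOGS.BRO_DHCP.json, expect old TS 0, but it is 1516183299747");
        System.out.println(c.resPath + " was last written at " + c.actualWriteTime());
    }
}
```

Under that assumption, the stored timestamp dates from about a week before the failed load, which is consistent with a leftover record from an earlier load attempt.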
The stack trace already shows where the error is thrown:
org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:315)
I am using Kylin 2.1, so I went to the kylin project on GitHub, selected the "2.1.x" branch, and opened HBaseResourceStore.java at line 315:
@Override
protected long checkAndPutResourceImpl(String resPath, byte[] content, long oldTS, long newTS)
        throws IOException, IllegalStateException {
    Table table = getConnection().getTable(TableName.valueOf(tableName));
    try {
        byte[] row = Bytes.toBytes(resPath);
        byte[] bOldTS = oldTS == 0 ? null : Bytes.toBytes(oldTS);
        Put put = buildPut(resPath, newTS, row, content, table);
        boolean ok = table.checkAndPut(row, B_FAMILY, B_COLUMN_TS, bOldTS, put);
        logger.trace("Update row " + resPath + " from oldTs: " + oldTS + ", to newTs: " + newTS
                + ", operation result: " + ok); // lines 311-312
        if (!ok) {
            long real = getResourceTimestampImpl(resPath);
            throw new IllegalStateException( // line 315
                    "Overwriting conflict " + resPath + ", expect old TS " + oldTS + ", but it is " + real);
        }
        return newTS;
    } finally {
        IOUtils.closeQuietly(table);
    }
}
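The check against oldTS did not pass, hence the exception. The code also shows that resPath is used directly as the row key. From the TRACE output in the error log above, which is produced by lines 311-312 of the code, resPath = "/table_exd/NLOGS.BRO_DHCP.json":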
Update row /table_exd/NLOGS.BRO_DHCP.json …
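To make the failure mode concrete, here is a small in-memory sketch of the compare-and-set behaviour that checkAndPutResourceImpl relies on. It does not use the HBase client; the class CheckAndPutSketch and its map are hypothetical stand-ins for the TS column of the metadata table. The rule it models is the one visible in the code: oldTS == 0 is turned into a null expected value, and HBase's checkAndPut treats a null expected value as "the column must not exist yet". Returning 0 for a missing row in the error message is an assumption about getResourceTimestampImpl.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the TS column of the metadata table (hypothetical, not Kylin code).
final class CheckAndPutSketch {
    private final Map<String, Long> tsByRow = new HashMap<>();

    // Stores newTs only when the stored TS matches oldTs.
    // oldTs == 0 means "the row must not exist yet", like a null expected value in checkAndPut.
    long checkAndPut(String resPath, long oldTs, long newTs) {
        Long stored = tsByRow.get(resPath);
        boolean ok = (oldTs == 0) ? stored == null : (stored != null && stored == oldTs);
        if (!ok) {
            long real = (stored == null) ? 0L : stored; // assumption: 0 when the row is absent
            throw new IllegalStateException("Overwriting conflict " + resPath
                    + ", expect old TS " + oldTs + ", but it is " + real);
        }
        tsByRow.put(resPath, newTs);
        return newTs;
    }

    public static void main(String[] args) {
        CheckAndPutSketch store = new CheckAndPutSketch();
        String row = "/table_exd/NLOGS.BRO_DHCP.json";
        store.checkAndPut(row, 0L, 1516183299747L);     // first load: row absent, write succeeds
        try {
            store.checkAndPut(row, 0L, 1516866947530L); // second load still expects "absent"
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call reproduces the situation in the log: the caller believes the resource is new (oldTS 0), but a record with an earlier timestamp is already there, so the write is rejected instead of silently overwriting it.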
We still need the HBase table name. It is set in the same file, at lines 87-107:
public HBaseResourceStore(KylinConfig kylinConfig) throws IOException {
    super(kylinConfig);
    metadataUrl = buildMetadataUrl(kylinConfig);
    tableName = metadataUrl.getIdentifier();
    createHTableIfNeeded(tableName);
}

private StorageURL buildMetadataUrl(KylinConfig kylinConfig) throws IOException {
    StorageURL url = kylinConfig.getMetadataUrl();
    if (!url.getScheme().equals("hbase"))
        throw new IOException("Cannot create HBaseResourceStore. Url not match. Url: " + url);

    // control timeout for prompt error report
    Map<String, String> newParams = new LinkedHashMap<>();
    newParams.put("hbase.client.scanner.timeout.period", "10000");
    newParams.put("hbase.rpc.timeout", "5000");
    newParams.put("hbase.client.retries.number", "1");
    newParams.putAll(url.getAllParameters());
    return url.copy(newParams);
}
tableName comes from the StorageURL class, so open that class. Line 97 reads:
this.identifier = n.isEmpty() ? "kylin_metadata" : n;
That settles both values:
HBase table name: kylin_metadata, row key: /table_exd/NLOGS.BRO_DHCP.json
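The table name is kylin_metadata here because that is the identifier in the metadata URL (or the fallback when the identifier is empty). The sketch below illustrates that derivation for a URL of the form identifier@scheme,key=value, such as kylin_metadata@hbase. It is a simplified approximation written for this article, not the actual StorageURL code: the class MetadataIdentifierSketch and the method identifierOf are hypothetical, and only the fallback expression is taken from the line quoted above. If your kylin.metadata.url uses a different identifier, that is the table to look in.

```java
// Simplified approximation of how the identifier is derived; not the real StorageURL class.
final class MetadataIdentifierSketch {
    static String identifierOf(String metadataUrl) {
        // Assumed layout: "identifier@scheme,key=value,..."; only the first segment matters here.
        String head = metadataUrl.split(",", 2)[0];
        int cut = head.lastIndexOf('@');
        String n = (cut < 0 ? head : head.substring(0, cut)).trim();
        // Same fallback as the line quoted from StorageURL.
        return n.isEmpty() ? "kylin_metadata" : n;
    }

    public static void main(String[] args) {
        System.out.println(identifierOf("kylin_metadata@hbase"));
        System.out.println(identifierOf("my_meta@hbase,hbase.rpc.timeout=5000"));
    }
}
```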
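Run the following in the HBase shell:
get "kylin_metadata","/table_exd/NLOGS.BRO_DHCP.json"
The command returns a result, so the table's metadata is already stored, but Kylin did not recognize it properly. Every repeated Load therefore conflicts when it writes to HBase: Kylin checks the timestamp and refuses to update the row when it does not match.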
3. Solution
Once the cause is known, the fix is simple: delete that row and load the Hive table again. The command is:
deleteall "kylin_metadata","/table_exd/NLOGS.BRO_DHCP.json"