CDH Service Monitor role error

Failed to open or create partition
com.cloudera.cmon.tstore.leveldb.LDBPartitionManager$LDBPartitionException: Unable to open DB in directory /var/lib/cloudera-service-monitor/ts/stream/partitions/stream_2019-03-18T12:37:18.736Z for partition LDBPartitionMetadataWrapper{tableName=stream, partitionName=stream_2019-03-18T12:37:18.736Z, startTime=2019-03-18T12:37:18.736Z, endTime=null, version=2, state=CLOSED}
	at com.cloudera.cmon.tstore.leveldb.LDBUtils.openOrCreatePartitionDB(LDBUtils.java:194)
	at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getOrOpenInternal(LDBPartitionManager.java:620)
	at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.openOrCreatePartitionLDB(LDBPartitionManager.java:557)
	at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartition(LDBPartitionManager.java:451)
	at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartitionRange(LDBPartitionManager.java:872)
	at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStreamTable.read(LDBTimeSeriesStreamTable.java:229)
	at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStreamTable.read(LDBTimeSeriesStreamTable.java:420)
	at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesRawStreamTable.read(LDBTimeSeriesRawStreamTable.java:242)
	at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStore.readFromStreamTable(LDBTimeSeriesStore.java:646)
	at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStore.read(LDBTimeSeriesStore.java:582)
	at com.cloudera.cmon.tstore.AggregatingTimeSeriesStore.read(AggregatingTimeSeriesStore.java:505)
	at com.cloudera.cmon.kaiser.BulkMetricFetcher.issueQuery(BulkMetricFetcher.java:440)
	at com.cloudera.cmon.kaiser.BulkMetricFetcher.access$000(BulkMetricFetcher.java:45)
	at com.cloudera.cmon.kaiser.BulkMetricFetcher$1.call(BulkMetricFetcher.java:394)
	at com.cloudera.cmon.kaiser.BulkMetricFetcher$1.call(BulkMetricFetcher.java:391)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 8 missing files; e.g.: /var/lib/cloudera-service-monitor/ts/stream/partitions/stream_2019-03-18T12:37:18.736Z/000005.sst
	at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:194)
	at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:212)
	at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
	at com.cloudera.cmon.tstore.leveldb.LDBUtils.openOrCreatePartitionDB(LDBUtils.java:184)
	... 20 more
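
In a trace like the one above, the useful part is the `Caused by` line, which names the damaged partition directory and the number of missing files. Below is a minimal sketch for pulling that information out of a role log. The function name `find_corrupt_partitions` is made up for illustration, and the regular expression only assumes the message format seen in this one trace; other versions may word it differently.

```python
import re

# Matches lines such as:
#   Caused by: <exception class>: Corruption: 8 missing files; e.g.: <path>/000005.sst
# The pattern is derived only from the trace in this post (assumption).
CORRUPTION_RE = re.compile(
    r"Caused by: \S+: Corruption: (\d+) missing files; e\.g\.: (\S+)"
)


def find_corrupt_partitions(log_text):
    """Return {partition_dir: missing_file_count} for each corruption line found."""
    result = {}
    for match in CORRUPTION_RE.finditer(log_text):
        count = int(match.group(1))
        example_file = match.group(2)
        # The example file lives directly inside the partition directory.
        partition_dir = example_file.rsplit("/", 1)[0]
        result[partition_dir] = count
    return result
```

Feeding it the log text returns a dictionary keyed by partition directory, which makes it easy to see whether one partition or several are affected.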

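The CDH cluster lost power, and after it was restarted CM could no longer monitor the cluster. The Service Monitor role log showed the error above. The `Caused by` line shows that 8 files are missing from a LevelDB partition. Renaming /var/lib/cloudera-service-monitor/ and then restarting CM fixed the problem.
This was a cluster I set up myself. I don't know what the consequences would be on a production cluster, so be careful!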
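
The rename step can be sketched as follows. This is only an illustration: `move_aside` is a hypothetical helper, the `.bak-<timestamp>` naming is my own choice, and it assumes the Service Monitor role is stopped while the directory is moved. Renaming instead of deleting keeps the old data on disk in case it is needed later.

```python
import os
import time


def move_aside(directory, suffix=None):
    """Rename `directory` to a sibling named <directory>.bak-<suffix>.

    Returns the new path. Nothing is deleted; the old data stays on disk.
    """
    directory = os.path.abspath(directory)  # also drops a trailing slash
    if not os.path.isdir(directory):
        raise FileNotFoundError(directory)
    if suffix is None:
        # Timestamp suffix so repeated runs never collide (naming is an assumption).
        suffix = time.strftime("%Y%m%d-%H%M%S")
    backup = f"{directory}.bak-{suffix}"
    if os.path.exists(backup):
        raise FileExistsError(backup)
    os.rename(directory, backup)
    return backup
```

On the affected host this would be called as `move_aside("/var/lib/cloudera-service-monitor/")` by a user with permission to rename that directory, followed by restarting CM as described above.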
