Compaction in OpenTSDB

Compaction is optional. If it is enabled in the configuration, OpenTSDB starts a daemon thread that periodically runs compactions in the background.

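In essence, an OpenTSDB compaction merges the 3600 seconds of data in one hour (3600 KeyValues in HBase) into a single KeyValue (both qualifier and value) to save storage. Every KeyValue in HBase stores its own copy of the rowkey, so heavy rowkey repetition is inefficient, which makes compaction worthwhile.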
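The merge described above can be sketched roughly as follows. This is a simplified illustration, not OpenTSDB's Compaction code: the class CompactSketch, the Cell holder, and the plain 2-byte second-offset qualifier are all assumptions made for the example, and the real qualifier and value encodings are not modeled.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: merge many per-second cells of one row into one cell.
final class CompactSketch {

    /** Hypothetical holder for the single merged cell. */
    static final class Cell {
        final byte[] qualifier;
        final byte[] value;

        Cell(byte[] qualifier, byte[] value) {
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Assumed encoding: the offset within the hour (0..3599) as 2 big-endian bytes. */
    static byte[] qualifier(int offsetSeconds) {
        return new byte[] { (byte) (offsetSeconds >>> 8), (byte) offsetSeconds };
    }

    /**
     * Concatenates the qualifiers and the values of all cells, in time order,
     * so that the row ends up with one qualifier and one value.
     */
    static Cell compact(TreeMap<Integer, byte[]> cellsByOffset) {
        ByteArrayOutputStream quals = new ByteArrayOutputStream();
        ByteArrayOutputStream vals = new ByteArrayOutputStream();
        for (Map.Entry<Integer, byte[]> e : cellsByOffset.entrySet()) {
            byte[] q = qualifier(e.getKey());
            quals.write(q, 0, q.length);
            vals.write(e.getValue(), 0, e.getValue().length);
        }
        return new Cell(quals.toByteArray(), vals.toByteArray());
    }
}
```

With 3600 one-byte values as input, the result is one cell with a 7200-byte qualifier and a 3600-byte value, and the rowkey is stored once instead of 3600 times.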

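Specifically, OpenTSDB uses a ConcurrentSkipListMap as its CompactionQueue. The queue holds the rowkey of every DataPoint inserted into HBase, and the CompactionThread periodically (every tsd.storage.compaction.flush_interval) takes rows from the queue and compacts them. A compaction pass also requires that the queue size exceed tsd.storage.compaction.min_flush_threshold, and only rows that are old enough are compacted. In OpenTSDB, "old enough" is hard-coded to one hour. Note that this is not aligned to the top of the hour; it means one hour before the current system time.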
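A minimal sketch of the queue and its two flush conditions is shown below. It is not the OpenTSDB implementation: CompactionQueueSketch, rowKey, and flush are hypothetical names, the rowkey is modeled as a String with a zero-padded base time prefix (real rowkeys are byte arrays), and it assumes that a row's age is measured from its base time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a compaction queue ordered by row base time.
final class CompactionQueueSketch {
    /** "Old enough" threshold: one hour, fixed in this sketch. */
    static final long MAX_AGE_SECONDS = 3600;

    private final ConcurrentSkipListMap<String, Boolean> queue =
        new ConcurrentSkipListMap<>();
    private final int minFlushThreshold;

    CompactionQueueSketch(int minFlushThreshold) {
        this.minFlushThreshold = minFlushThreshold;
    }

    /** Assumed key layout: zero-padded base time first, so keys sort oldest first. */
    static String rowKey(long baseTimeSeconds, String seriesId) {
        return String.format("%010d:%s", baseTimeSeconds, seriesId);
    }

    static long baseTimeOf(String rowKey) {
        return Long.parseLong(rowKey.substring(0, 10));
    }

    /** Called on every write; duplicate rowkeys collapse into one entry. */
    void add(String rowKey) {
        queue.put(rowKey, Boolean.TRUE);
    }

    int size() {
        return queue.size();
    }

    /** Removes and returns the rows that should be compacted in this pass. */
    List<String> flush(long nowSeconds) {
        List<String> ready = new ArrayList<>();
        if (queue.size() <= minFlushThreshold) {
            return ready; // queue too small, skip this pass
        }
        long cutoff = nowSeconds - MAX_AGE_SECONDS;
        for (String key : queue.keySet()) {
            if (baseTimeOf(key) > cutoff) {
                break; // keys are time-ordered, so the rest are newer
            }
            if (queue.remove(key) != null) {
                ready.add(key);
            }
        }
        return ready;
    }
}
```

Keeping the map sorted by time lets a flush pass stop at the first row that is too recent instead of scanning the whole queue.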

For each old-enough rowkey, the CompactionThread calls the get method of the TSDB class to fetch the entire row (a List) and then compacts that row. The compaction logic itself lives in the Compaction class.

In 2.2, writing to HBase columns via appends is now supported. This can improve both read and write performance in that TSDs will no longer maintain a queue of rows to compact at the end of each hour, thus preventing a massive read and re-write operation in HBase. However, due to the way appends operate in HBase, an increase in CPU utilization, store file size and HDFS traffic will occur on the region servers. Make sure to monitor your HBase servers closely.
