An Incomplete Analysis of the HLog Processing Flow

 

The HLog processing flow:

When data in HBase is updated with put/delete/append/increment operations, the operation is recorded in the write-ahead log so that it can be replayed later.

HLog writes go through the FSHLog instance that is created when the regionserver instance is created.

The LogRoller thread periodically checks whether the log should be rolled and cleans up expired log files.

 

The check interval is configured with hbase.regionserver.logroll.period and defaults to 3600000 ms (one hour).

If that interval has not elapsed yet, the thread waits for the wake interval configured by hbase.server.thread.wakefrequency, which defaults to 10*1000 ms.

The FSHLog instance holds an FSHLog.LogSyncer thread instance and starts it. This thread is mainly responsible for writing the log entries to HDFS.

Its execution interval is configured with hbase.regionserver.optionallogflushinterval and defaults to 1000 ms.

If this value is configured to be less than 0, HLog entries are written (synced) in real time.
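To make these settings concrete, here is a minimal, hypothetical sketch (not part of the HBase source discussed in this article) that reads the configuration keys mentioned above, with the defaults the article lists, using the standard HBaseConfiguration/Configuration API:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalConfigDump {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // How often LogRoller forces a roll even without an explicit request (default 1 hour).
    long rollPeriod = conf.getLong("hbase.regionserver.logroll.period", 3600000L);
    // How long LogRoller sleeps between checks when no roll is pending.
    long wakeFrequency = conf.getLong("hbase.server.thread.wakefrequency", 10 * 1000L);
    // How often LogSyncer wakes up to sync deferred writes to HDFS.
    long optionalFlushInterval = conf.getLong("hbase.regionserver.optionallogflushinterval", 1000L);
    // Maximum number of rolled (closed) WAL files before regions are force-flushed.
    int maxLogs = conf.getInt("hbase.regionserver.maxlogs", 32);
    // Fraction of the HDFS block size at which a roll is requested.
    float rollMultiplier = conf.getFloat("hbase.regionserver.logroll.multiplier", 0.95f);
    System.out.printf("rollPeriod=%d wake=%d flushInterval=%d maxLogs=%d multiplier=%.2f%n",
        rollPeriod, wakeFrequency, optionalFlushInterval, maxLogs, rollMultiplier);
  }
}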

 

 

 

When an update is executed and WAL logging is enabled, appendNoSync --> append is called to record the edit in the log.

The seqid recorded with each log entry starts out, on the first append, as the largest seqid among all regions on the current regionserver plus one (FSHLog.logSeqNum),

and logSeqNum is incremented by one on every append. This value is also the largest seqid recorded in the fileinfo of the HFile written at flush time.

append puts the entry to be logged into the FSHLog.LogSyncer.pendingWrites queue, where it waits for the LogSyncer run method to sync it.

Note: for writes to the meta log, every single write is followed by a sync, so that meta data is never lost.
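The append-side bookkeeping described above can be pictured with the following simplified, illustrative sketch; the class and field names here are stand-ins, not the actual FSHLog types:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Simplified illustration of the append path: each append takes the next sequence id
// and queues the entry for the syncer thread to flush to HDFS later.
class MiniWalSketch {
  private final AtomicLong logSeqNum = new AtomicLong(0);       // stands in for FSHLog.logSeqNum
  private final AtomicLong unflushedEntries = new AtomicLong(0);
  private final List<String> pendingWrites = new ArrayList<>(); // stands in for LogSyncer.pendingWrites

  long appendNoSync(String edit) {
    long seqId = logSeqNum.incrementAndGet();       // one seqid per appended entry
    synchronized (pendingWrites) {
      pendingWrites.add(seqId + ":" + edit);        // queued, not yet on HDFS
    }
    return unflushedEntries.incrementAndGet();      // txid that a later sync(txid) can wait on
  }

  List<String> getPendingWrites() {                 // drained by the syncer thread
    synchronized (pendingWrites) {
      List<String> out = new ArrayList<>(pendingWrites);
      pendingWrites.clear();
      return out;
    }
  }
}

The LogSyncer.run loop itself is shown below.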

 

 

 

public void run() {
  try {
    // awaiting with a timeout doesn't always
    // throw exceptions on interrupt
    while (!this.isInterrupted() && !closeLogSyncer.get()) {
      try {
        // If there is nothing new to write to HDFS, i.e. the unflushed counter is
        // not ahead of the last synced txid, there are no new entries.
        // unflushedEntries is incremented by one on every append.
        if (unflushedEntries.get() <= syncedTillHere) {
          synchronized (closeLogSyncer) {
            // wait until woken up or until the optional flush interval elapses
            closeLogSyncer.wait(this.optionalFlushInterval);
          }
        }
        // Calling sync since we waited or had unflushed entries.
        // Entries appended but not sync'd are taken care of here AKA
        // deferred log flush
        // Otherwise, write the pending log entries to HDFS, using the value of
        // unflushedEntries as the txid.
        sync();
      } catch (IOException e) {
        LOG.error("Error while syncing, requesting close of hlog ", e);
        requestLogRoll();
        Threads.sleep(this.optionalFlushInterval);
      }
    }
  } catch (InterruptedException e) {
    LOG.debug(getName() + " interrupted while waiting for sync requests");
  } finally {
    LOG.info(getName() + " exiting");
  }
}

 

 

 

The value of unflushedEntries is used as the txid:

private void syncer() throws IOException {
  syncer(this.unflushedEntries.get()); // sync all pending items
}

 

 

 

 

 

private void syncer(long txid) throws IOException {
  // if the transaction that we are interested in is already
  // synced, then return immediately.
  // Check whether anything newer than what has already been synced exists;
  // if not, return right away to avoid unnecessary work.
  if (txid <= this.syncedTillHere) {
    return;
  }
  Writer tempWriter;
  synchronized (this.updateLock) {
    if (this.closed) return;
    // Guaranteed non-null.
    // Note that parallel sync can close tempWriter.
    // The current method of dealing with this is to catch exceptions.
    // See HBASE-4387, HBASE-5623, HBASE-7329.
    tempWriter = this.writer;
  }
  try {
    long doneUpto;
    long now = EnvironmentEdgeManager.currentTimeMillis();
    // First flush all the pending writes to HDFS. Then
    // issue the sync to HDFS. If sync is successful, then update
    // syncedTillHere to indicate that transactions till this
    // number has been successfully synced.
    IOException ioe = null;
    List<Entry> pending = null;
    synchronized (flushLock) {
      if (txid <= this.syncedTillHere) {
        return;
      }
      // take the latest unflushed txid
      doneUpto = this.unflushedEntries.get();
      // take the list of log entries that still need to be written to HDFS
      pending = logSyncer.getPendingWrites();
      try {
        // append the pending entries to the HLog writer's output
        logSyncer.hlogFlush(tempWriter, pending);
        postAppend(pending);
      } catch (IOException io) {
        ioe = io;
        LOG.error("syncer encountered error, will retry. txid=" + txid, ioe);
      }
    }
    if (ioe != null && pending != null) {
      synchronized (this.updateLock) {
        synchronized (flushLock) {
          // HBASE-4387, HBASE-5623, retry with updateLock held
          tempWriter = this.writer;
          logSyncer.hlogFlush(tempWriter, pending);
          postAppend(pending);
        }
      }
    }
    // another thread might have sync'ed avoid double-sync'ing
    if (txid <= this.syncedTillHere) {
      return;
    }
    try {
      if (tempWriter != null) {
        tempWriter.sync();
        postSync();
      }
    } catch (IOException ex) {
      synchronized (this.updateLock) {
        // HBASE-4387, HBASE-5623, retry with updateLock held
        // TODO: we don't actually need to do it for concurrent close - what is the point
        // of syncing new unrelated writer? Keep behavior for now.
        tempWriter = this.writer;
        if (tempWriter != null) {
          // sync the underlying HDFS file so the data is durable
          tempWriter.sync();
          postSync();
        }
      }
    }
    // Advance syncedTillHere to the unflushedEntries value captured above; the next
    // call uses it to decide whether any new entries need to be synced.
    this.syncedTillHere = Math.max(this.syncedTillHere, doneUpto);

    this.metrics.finishSync(EnvironmentEdgeManager.currentTimeMillis() - now);
    // TODO: preserving the old behavior for now, but this check is strange. It's not
    // protected by any locks here, so for all we know rolling locks might start
    // as soon as we enter the "if". Is this best-effort optimization check?
    if (!this.logRollRunning) {
      checkLowReplication();
      try {
        // If the current HLog file has grown past the roll threshold, request a roll.
        // The threshold is hbase.regionserver.hlog.blocksize (defaults to the HDFS
        // block size) multiplied by hbase.regionserver.logroll.multiplier (default 0.95).
        long curLogSize = tempWriter.getLength();
        if (curLogSize > this.logrollsize) {
          // this ends up calling LogRoller.logRollRequested
          requestLogRoll();
        }
      } catch (IOException x) {
        LOG.debug("Log roll failed and will be retried. (This is not an error)");
      }
    }
  } catch (IOException e) {
    LOG.fatal("Could not sync. Requesting roll of hlog", e);
    requestLogRoll();
    throw e;
  }
}
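As a side note on the roll threshold checked at the end of syncer: assuming a 128 MB HDFS block size, the default threshold works out as in this small, self-contained sketch (the values are purely illustrative):

public class LogRollSizeSketch {
  public static void main(String[] args) {
    long hdfsBlockSize = 128L * 1024 * 1024;   // assume a 128 MB HDFS block size
    float multiplier = 0.95f;                  // default hbase.regionserver.logroll.multiplier
    long logrollsize = (long) (hdfsBlockSize * multiplier);
    System.out.println("request a log roll once the current HLog exceeds "
        + logrollsize + " bytes (~121.6 MB)");
  }
}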

 

 

 

LogRoller.logRollRequested flow:

This method's only job is to wake up the LogRoller thread itself.

public void logRollRequested() {
  synchronized (rollLog) {
    rollLog.set(true);
    rollLog.notifyAll();
  }
}

 

 

 

LogRoller.run method:

public void run() {
  while (!server.isStopped()) {
    long now = System.currentTimeMillis();
    boolean periodic = false;
    // If HLog writing explicitly called logRollRequested, rollLog is already true
    // and the following if-block is skipped.
    if (!rollLog.get()) {
      // Check whether the time since the last roll exceeds the roll period,
      // configured via hbase.regionserver.logroll.period (default 3600000 ms).
      periodic = (now - this.lastrolltime) > this.rollperiod;
      // If the roll period has not elapsed yet, wait; the wait timeout is
      // hbase.server.thread.wakefrequency (default 10*1000 ms).
      if (!periodic) {
        synchronized (rollLog) {
          try {
            rollLog.wait(this.threadWakeFrequency);
          } catch (InterruptedException e) {
            // Fall through
          }
        }
        continue;
      }
      // Time for periodic roll
      if (LOG.isDebugEnabled()) {
        LOG.debug("Hlog roll period " + this.rollperiod + "ms elapsed");
      }
    } else if (LOG.isDebugEnabled()) {
      LOG.debug("HLog roll requested");
    }
    rollLock.lock(); // FindBugs UL_UNRELEASED_LOCK_EXCEPTION_PATH
    try {
      // record the current time as the last roll time
      this.lastrolltime = now;
      // This is array of actual region names.
      // Call FSHLog.rollWriter: it checks whether the number of hlog files has reached
      // the configured limit. If so, hlog files whose regions have all been flushed are
      // moved to the oldWALs directory; among the closed-but-not-archived hlog files the
      // smallest seqid is taken, and the regions whose unflushed seqid is smaller than
      // that value are returned.
      byte[][] regionsToFlush = getWAL().rollWriter(rollLog.get());
      if (regionsToFlush != null) {
        // issue flush requests, forcing the returned regions to flush
        for (byte[] r : regionsToFlush) scheduleFlush(r);
      }
    } catch (FailedLogCloseException e) {
      server.abort("Failed log close in log roller", e);
    } catch (java.net.ConnectException e) {
      server.abort("Failed log close in log roller", e);
    } catch (IOException ex) {
      // Abort if we get here. We probably won't recover an IOE. HBASE-1132
      server.abort("IOE in log roller",
          RemoteExceptionHandler.checkIOException(ex));
    } catch (Exception ex) {
      LOG.error("Log rolling failed", ex);
      server.abort("Log rolling failed", ex);
    } finally {
      try {
        rollLog.set(false);
      } finally {
        rollLock.unlock();
      }
    }
  }
  LOG.info("LogRoller exiting.");
}

 

 

 

Analysis of the FSHLog.rollWriter flow:

The smallest already-flushed seqid across all regions is compared against the old hlog files; old hlogs whose seqid is smaller than that flush point contain only flushed edits and are moved to the oldWALs directory.

If the number of closed hlog files exceeds the value configured by hbase.regionserver.maxlogs (default 32), the regions that still have unflushed edits sitting in those closed hlogs are returned so that they can be force-flushed.

 

 

 

 

 

public byte[][] rollWriter(boolean force)
    throws FailedLogCloseException, IOException {
  synchronized (rollWriterLock) {
    // Return if nothing to flush.
    // numEntries in FSHLog is incremented by one on every append.
    if (!force && this.writer != null && this.numEntries.get() <= 0) {
      return null;
    }
    byte[][] regionsToFlush = null;
    if (closed) {
      LOG.debug("HLog closed. Skipping rolling of writer");
      return null;
    }
    try {
      // .................... part of the code is not shown here

      // filenum is -1 the first time; the first call happens when the FSHLog instance
      // is created and obtains its Writer instance through this method.
      long currentFilenum = this.filenum;
      Path oldPath = null;
      // From the second call onward, i.e. when LogRoller explicitly rolls:
      if (currentFilenum > 0) {
        //computeFilename  will take care of meta hlog filename
        // compute the path of the previous hlog file, i.e. the file about to be closed
        oldPath = computeFilename(currentFilenum);
      }
      // generate a new HLog file name and create its path
      this.filenum = System.currentTimeMillis();
      Path newPath = computeFilename();
      while (fs.exists(newPath)) {
        this.filenum++;
        newPath = computeFilename();
      }

      // .................... part of the code is not shown here

      // Create a new Writer instance; the default implementation is ProtobufLogWriter,
      // configurable via hbase.regionserver.hlog.writer.impl.
      FSHLog.Writer nextWriter = this.createWriterInstance(fs, newPath, conf);
      // Can we get at the dfsclient outputstream?
      FSDataOutputStream nextHdfsOut = null;
      if (nextWriter instanceof ProtobufLogWriter) {
        // get the OutputStream of the HDFS path backing this writer
        nextHdfsOut = ((ProtobufLogWriter) nextWriter).getStream();
        // perform the costly sync before we get the lock to roll writers.
        try {
          nextWriter.sync();
        } catch (IOException e) {
          // optimization failed, no need to abort here.
          LOG.warn("pre-sync failed", e);
        }
      }

      Path oldFile = null;
      int oldNumEntries = 0;
      synchronized (updateLock) {
        // Clean up current writer.
        oldNumEntries = this.numEntries.get();
        // Close the previous HLog's HDFS output stream and set writer to null;
        // the closed hlog file's seqid and path are recorded in the outputfiles map.
        oldFile = cleanupCurrentWriter(currentFilenum);
        // install the new hlog file's writer and outputstream on this FSHLog instance
        this.writer = nextWriter;
        this.hdfs_out = nextHdfsOut;
        this.numEntries.set(0);
      }

      // .................... part of the code is not shown here

      // Can we delete any of the old log files?
      // If outputfiles is non-empty, check whether the number of rolled logs has
      // reached the configured limit.
      if (getNumRolledLogFiles() > 0) {
        // If the number of files reaches hbase.regionserver.maxlogs (default 32),
        // find the smallest already-flushed seqid across all regions, take the hlog
        // files in outputfiles whose seqid is smaller, and move them to oldWALs.
        cleanOldLogs();
        // Take the smallest seqid in outputfiles and return the regions whose
        // unflushed seqid is smaller than it.
        regionsToFlush = getRegionsToForceFlush();
      }
    } finally {
      this.logRollRunning = false;
      closeBarrier.endOp();
    }
    return regionsToFlush;
  }
}
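The archive/force-flush decision hidden behind cleanOldLogs and getRegionsToForceFlush can be summarized with the following simplified sketch; the maps used here (rolled hlog file -> highest seqid it contains, region -> lowest unflushed seqid) are illustrative stand-ins for FSHLog's internal bookkeeping, not its actual fields:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RollDecisionSketch {
  // A rolled hlog whose highest seqid is below the lowest unflushed seqid of every
  // region can be archived to oldWALs: all of its edits are already in HFiles.
  static List<String> archivableLogs(Map<String, Long> rolledLogMaxSeqId,
                                     Map<String, Long> lowestUnflushedSeqIdPerRegion) {
    long oldestUnflushed = lowestUnflushedSeqIdPerRegion.values().stream()
        .min(Long::compare).orElse(Long.MAX_VALUE);
    List<String> archivable = new ArrayList<>();
    for (Map.Entry<String, Long> e : rolledLogMaxSeqId.entrySet()) {
      if (e.getValue() < oldestUnflushed) {
        archivable.add(e.getKey()); // would be moved to the oldWALs directory
      }
    }
    return archivable;
  }

  // If too many rolled logs remain (more than hbase.regionserver.maxlogs), the regions
  // whose unflushed edits still pin the oldest rolled log must be flushed so that the
  // log can be archived on a later roll.
  static List<String> regionsToForceFlush(Map<String, Long> rolledLogMaxSeqId,
                                          Map<String, Long> lowestUnflushedSeqIdPerRegion,
                                          int maxLogs) {
    List<String> toFlush = new ArrayList<>();
    if (rolledLogMaxSeqId.size() <= maxLogs) {
      return toFlush;
    }
    long oldestLogMaxSeqId = rolledLogMaxSeqId.values().stream()
        .min(Long::compare).orElse(Long.MAX_VALUE);
    for (Map.Entry<String, Long> e : lowestUnflushedSeqIdPerRegion.entrySet()) {
      if (e.getValue() <= oldestLogMaxSeqId) {
        toFlush.add(e.getKey());
      }
    }
    return toFlush;
  }

  public static void main(String[] args) {
    Map<String, Long> logs = new HashMap<>();
    logs.put("hlog.001", 100L);
    logs.put("hlog.002", 200L);
    Map<String, Long> regions = new HashMap<>();
    regions.put("regionA", 80L);   // edits from seqid 80 onward not yet flushed
    regions.put("regionB", 400L);
    System.out.println("archivable: " + archivableLogs(logs, regions));          // [] - regionA pins both logs
    System.out.println("force flush: " + regionsToForceFlush(logs, regions, 1)); // [regionA]
  }
}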

 

 
