基于上一篇文章http://blog.csdn.net/simonchi/article/details/42520193 相对比较细致的分析后,该文章将对LoadBalancingSinkProcessor源码进行选择性的重要逻辑代码进行讲解
首先读取配置,当然是重写congifure方法
public void configure(Context context) {
Preconditions.checkState(getSinks().size() > 1,
"The LoadBalancingSinkProcessor cannot be used for a single sink. "
+ "Please configure more than one sinks and try again.");
String selectorTypeName = context.getString(CONFIG_SELECTOR,
SELECTOR_NAME_ROUND_ROBIN);
Boolean shouldBackOff = context.getBoolean(CONFIG_BACKOFF, false);
selector = null;
if (selectorTypeName.equalsIgnoreCase(SELECTOR_NAME_ROUND_ROBIN)) {
selector = new RoundRobinSinkSelector(shouldBackOff);
} else if (selectorTypeName.equalsIgnoreCase(SELECTOR_NAME_RANDOM)) {
selector = new RandomOrderSinkSelector(shouldBackOff);
} else {
try {
@SuppressWarnings("unchecked")
Class extends SinkSelector> klass = (Class extends SinkSelector>)
Class.forName(selectorTypeName);
selector = klass.newInstance();
} catch (Exception ex) {
throw new FlumeException("Unable to instantiate sink selector: "
+ selectorTypeName, ex);
}
}
selector.setSinks(getSinks());
selector.configure(
new Context(context.getSubProperties(CONFIG_SELECTOR_PREFIX)));
LOGGER.debug("Sink selector: " + selector + " initialized");
}
该方法最重要的就是后面实例化了selector对象,通过Class.forName去加载该类,就是配置文件中selector对应的值
selector.setSinks(getSinks());
这行是给该selector设置了sinks,sinks是从配置文件中sinks读取的
再来看核心逻辑process()方法
public Status process() throws EventDeliveryException {
Status status = null;
Iterator sinkIterator = selector.createSinkIterator();
while (sinkIterator.hasNext()) {
Sink sink = sinkIterator.next();
try {
status = sink.process();
break;
} catch (Exception ex) {
selector.informSinkFailed(sink);
LOGGER.warn("Sink failed to consume event. "
+ "Attempting next sink if available.", ex);
}
}
if (status == null) {
throw new EventDeliveryException("All configured sinks have failed");
}
return status;
}
首先是创建了一个sink遍历器
我们来看下round_robin的实现
public Iterator createIterator() {
List activeIndices = getIndexList();
int size = activeIndices.size();
// possible that the size has shrunk so gotta adjust nextHead for that
if (nextHead >= size) {
nextHead = 0;
}
int begin = nextHead++;
if (nextHead == activeIndices.size()) {
nextHead = 0;
}
int[] indexOrder = new int[size];
for (int i = 0; i < size; i++) {
indexOrder[i] = activeIndices.get((begin + i) % size);
}
return new SpecificOrderIterator(indexOrder, getObjects());
}
1、首先获取当前有效的sink
2、顺序指定好
for (int i = 0; i < size; i++) {
indexOrder[i] = activeIndices.get((begin + i) % size);
}
这其实就是round_robin的算法
while (sinkIterator.hasNext()) {
Sink sink = sinkIterator.next();
try {
status = sink.process();
break;
} catch (Exception ex) {
selector.informSinkFailed(sink);
LOGGER.warn("Sink failed to consume event. "
+ "Attempting next sink if available.", ex);
}
}
循环内部一次选择有效的sink进行处理
异常部分,我们发现触发了informSinkFailed()方法,我们来看看该方法
public void informFailure(T failedObject) {
//If there is no backoff this method is a no-op.
if (!shouldBackOff) {
return;
}
FailureState state = stateMap.get(failedObject);
long now = System.currentTimeMillis();
long delta = now - state.lastFail;
/*
* When do we increase the backoff period?
* We basically calculate the time difference between the last failure
* and the current one. If this failure happened within one hour of the
* last backoff period getting over, then we increase the timeout,
* since the object did not recover yet. Else we assume this is a fresh
* failure and reset the count.
*/
long lastBackoffLength = Math.min(maxTimeout, 1000 * (1 << state.sequentialFails));
long allowableDiff = lastBackoffLength + CONSIDER_SEQUENTIAL_RANGE;
if (allowableDiff > delta) {
if (state.sequentialFails < EXP_BACKOFF_COUNTER_LIMIT) {
state.sequentialFails++;
}
} else {
state.sequentialFails = 1;
}
state.lastFail = now;
//Depending on the number of sequential failures this component had, delay
//its restore time. Each time it fails, delay the restore by 1000 ms,
//until the maxTimeOut is reached.
state.restoreTime = now + Math.min(maxTimeout, 1000 * (1 << state.sequentialFails));
}
实现如上:
当然该方法是针对配置中选择惩罚的机制,也就是backoff=true,所以第一行,如果你配置的不选择惩罚,当然就不会执行该方法了;
惩罚机制:
首先选择该失败sink,读取其是否有过失败记录,因为失败的sink有状态记录的,就是代码中那些字段
这些状态的变来变去,读者自行阅读吧,就是个惩罚机制,没什么好说的!!!
这样总的看起来,flume内置的两种SinkProcessor其实没什么东西,只要你按照这种结构,就可以开发自己的sinkprocessor了