基于上一篇文章http://blog.csdn.net/simonchi/article/details/42520193 相对比较细致的分析后,该文章将对LoadBalancingSinkProcessor源码进行选择性的重要逻辑代码进行讲解
首先读取配置,当然是重写congifure方法
public void configure(Context context) { Preconditions.checkState(getSinks().size() > 1, "The LoadBalancingSinkProcessor cannot be used for a single sink. " + "Please configure more than one sinks and try again."); String selectorTypeName = context.getString(CONFIG_SELECTOR, SELECTOR_NAME_ROUND_ROBIN); Boolean shouldBackOff = context.getBoolean(CONFIG_BACKOFF, false); selector = null; if (selectorTypeName.equalsIgnoreCase(SELECTOR_NAME_ROUND_ROBIN)) { selector = new RoundRobinSinkSelector(shouldBackOff); } else if (selectorTypeName.equalsIgnoreCase(SELECTOR_NAME_RANDOM)) { selector = new RandomOrderSinkSelector(shouldBackOff); } else { try { @SuppressWarnings("unchecked") Class<? extends SinkSelector> klass = (Class<? extends SinkSelector>) Class.forName(selectorTypeName); selector = klass.newInstance(); } catch (Exception ex) { throw new FlumeException("Unable to instantiate sink selector: " + selectorTypeName, ex); } } selector.setSinks(getSinks()); selector.configure( new Context(context.getSubProperties(CONFIG_SELECTOR_PREFIX))); LOGGER.debug("Sink selector: " + selector + " initialized"); }该方法最重要的就是后米娜实例化了selector对象,通过Class.forName去加载该类,就是配置文件中selector对应的值
selector.setSinks(getSinks());这行是给该selector设置了sinks,sinks是从配置文件中sinks读取的
再来看核心逻辑process()方法
public Status process() throws EventDeliveryException { Status status = null; Iterator<Sink> sinkIterator = selector.createSinkIterator(); while (sinkIterator.hasNext()) { Sink sink = sinkIterator.next(); try { status = sink.process(); break; } catch (Exception ex) { selector.informSinkFailed(sink); LOGGER.warn("Sink failed to consume event. " + "Attempting next sink if available.", ex); } } if (status == null) { throw new EventDeliveryException("All configured sinks have failed"); } return status; }首先是创建了一个sink遍历器
我们来看下round_robin的实现
public Iterator<T> createIterator() { List<Integer> activeIndices = getIndexList(); int size = activeIndices.size(); // possible that the size has shrunk so gotta adjust nextHead for that if (nextHead >= size) { nextHead = 0; } int begin = nextHead++; if (nextHead == activeIndices.size()) { nextHead = 0; } int[] indexOrder = new int[size]; for (int i = 0; i < size; i++) { indexOrder[i] = activeIndices.get((begin + i) % size); } return new SpecificOrderIterator<T>(indexOrder, getObjects()); }1、首先获取当前有效的sink
for (int i = 0; i < size; i++) { indexOrder[i] = activeIndices.get((begin + i) % size); }这其实就是round_robin的算法
while (sinkIterator.hasNext()) { Sink sink = sinkIterator.next(); try { status = sink.process(); break; } catch (Exception ex) { selector.informSinkFailed(sink); LOGGER.warn("Sink failed to consume event. " + "Attempting next sink if available.", ex); } }循环内部一次选择有效的sink进行处理
异常部分,我们发现触发了informSinkFailed()方法,我们来看看该方法
public void informFailure(T failedObject) { //If there is no backoff this method is a no-op. if (!shouldBackOff) { return; } FailureState state = stateMap.get(failedObject); long now = System.currentTimeMillis(); long delta = now - state.lastFail; /* * When do we increase the backoff period? * We basically calculate the time difference between the last failure * and the current one. If this failure happened within one hour of the * last backoff period getting over, then we increase the timeout, * since the object did not recover yet. Else we assume this is a fresh * failure and reset the count. */ long lastBackoffLength = Math.min(maxTimeout, 1000 * (1 << state.sequentialFails)); long allowableDiff = lastBackoffLength + CONSIDER_SEQUENTIAL_RANGE; if (allowableDiff > delta) { if (state.sequentialFails < EXP_BACKOFF_COUNTER_LIMIT) { state.sequentialFails++; } } else { state.sequentialFails = 1; } state.lastFail = now; //Depending on the number of sequential failures this component had, delay //its restore time. Each time it fails, delay the restore by 1000 ms, //until the maxTimeOut is reached. state.restoreTime = now + Math.min(maxTimeout, 1000 * (1 << state.sequentialFails)); }实现如上:
当然该方法是针对配置中选择惩罚的机制,也就是backoff=true,所以第一行,如果你配置的不选择惩罚,当然就不会执行该方法了;
惩罚机制:
首先选择该失败sink,读取其是否有过失败记录,因为失败的sink有状态记录的,就是代码中那些字段
这些状态的变来变去,读者自行阅读吧,就是个惩罚机制,没什么好说的!!!
这样总的看起来,flume内置的两种SinkProcessor其实没什么东西,只要你按照这种结构,就可以开发自己的sinkprocessor了