Before going any further, let's look at the core components of KafkaConsumer through its constructor:
private KafkaConsumer(ConsumerConfig config,
                      Deserializer<K> keyDeserializer,
                      Deserializer<V> valueDeserializer) {
    try {
        ...
        // Metadata
        this.metadata = new Metadata(retryBackoffMs, config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG));
        ...
        // NetworkClient
        ChannelBuilder channelBuilder = ClientUtils.createChannelBuilder(config.values());
        NetworkClient netClient = new NetworkClient(
                new Selector(config.getLong(ConsumerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), metrics, time, metricGrpPrefix, metricsTags, channelBuilder),
                this.metadata,
                clientId,
                100, // a fixed large enough value will suffice
                config.getLong(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG),
                config.getInt(ConsumerConfig.SEND_BUFFER_CONFIG),
                config.getInt(ConsumerConfig.RECEIVE_BUFFER_CONFIG),
                config.getInt(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG), time);
        ...
        // ConsumerNetworkClient
        this.client = new ConsumerNetworkClient(netClient, metadata, time, retryBackoffMs);
        OffsetResetStrategy offsetResetStrategy = OffsetResetStrategy.valueOf(config.getString(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG).toUpperCase());
        ...
        // SubscriptionState
        this.subscriptions = new SubscriptionState(offsetResetStrategy);
        List<PartitionAssignor> assignors = config.getConfiguredInstances(
                ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                PartitionAssignor.class);
        ...
        // ConsumerCoordinator
        this.coordinator = new ConsumerCoordinator(this.client,
                config.getString(ConsumerConfig.GROUP_ID_CONFIG),
                config.getInt(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG),
                config.getInt(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG),
                assignors,
                this.metadata,
                this.subscriptions,
                metrics,
                metricGrpPrefix,
                metricsTags,
                this.time,
                retryBackoffMs,
                new ConsumerCoordinator.DefaultOffsetCommitCallback(),
                config.getBoolean(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG),
                config.getLong(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG));
        if (keyDeserializer == null) {
            this.keyDeserializer = config.getConfiguredInstance(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    Deserializer.class);
        ...
        // Fetcher
        this.fetcher = new Fetcher<>(this.client,
                config.getInt(ConsumerConfig.FETCH_MIN_BYTES_CONFIG),
                config.getInt(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG),
                config.getInt(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG),
                config.getBoolean(ConsumerConfig.CHECK_CRCS_CONFIG),
                this.keyDeserializer,
                this.valueDeserializer,
                this.metadata,
                this.subscriptions,
                metrics,
                metricGrpPrefix,
                metricsTags,
                this.time,
                this.retryBackoffMs);
    ...
}
From the code above we can see that KafkaConsumer is built around the following core components:
(1) Metadata // same as in KafkaProducer
(2) NetworkClient // same as in KafkaProducer
(3) ConsumerNetworkClient // a wrapper around NetworkClient, detailed later
(4) ConsumerCoordinator // covered in the previous post: responsible for partition assignment and rebalance
(5) SubscriptionState // maintains the offset state of the subscribed TopicPartitions
(6) Fetcher // fetches the messages
This post covers Fetcher and SubscriptionState in detail.
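Before diving into the internals, here is a minimal sketch of how these components are exercised from the outside (broker address, group id and topic name below are made up for illustration):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "demo-group");              // hypothetical group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("demo-topic")); // hypothetical topic
        while (true) {
            // each poll() call runs through the pollOnce() flow analyzed below
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset=%d, value=%s%n", record.offset(), record.value());
        }
    }
}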
Now let's walk through the whole flow of a consumer poll, which also contains the Fetcher flow:
private Map<TopicPartition, List<ConsumerRecord<K, V>>> pollOnce(long timeout) {
    // analyzed in the previous post: make sure the coordinator is known first
    coordinator.ensureCoordinatorKnown();

    // analyzed in the previous post: each consumer gets its assigned partitions
    if (subscriptions.partitionsAutoAssigned())
        coordinator.ensurePartitionAssignment();

    // offset initialization: analyzed in this post
    if (!subscriptions.hasAllFetchPositions())
        updateFetchPositions(this.subscriptions.missingFetchPositions());

    Cluster cluster = this.metadata.fetch();
    Map<TopicPartition, List<ConsumerRecord<K, V>>> records = fetcher.fetchedRecords();
    if (!records.isEmpty()) {
        return records;
    }

    // the 3 steps of the fetcher
    // step 1: build the FetchRequests and put them into the send queue
    fetcher.initFetches(cluster);
    // step 2: network poll
    client.poll(timeout);
    // step 3: collect the results
    return fetcher.fetchedRecords();
}
When a consumer starts for the first time, the first question it faces is: at which offset should it start consuming? There are 2 strategies for this:
(1) Call seek(TopicPartition, offset) explicitly, then start polling.
(2) Before polling, send a request to the cluster and let the cluster tell the client what the current offset of each TopicPartition is. That is exactly what the code above does:

if (!subscriptions.hasAllFetchPositions())
    updateFetchPositions(this.subscriptions.missingFetchPositions());
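As a sketch of the first strategy (topic name, partition and offset below are made up for illustration):

import java.util.Collections;
import org.apache.kafka.common.TopicPartition;

// assume `consumer` is an already constructed KafkaConsumer<String, String>
TopicPartition tp = new TopicPartition("demo-topic", 0); // hypothetical topic/partition
consumer.assign(Collections.singletonList(tp));
consumer.seek(tp, 42L); // subsequent poll() calls fetch starting at offset 42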
Let's look at the internals of this function:
// iterate over all TopicPartitions: if hasValidPosition == false for any of them,
// the client does not yet know its offset and must ask the server
public boolean hasAllFetchPositions() {
    for (TopicPartitionState state : assignment.values())
        if (!state.hasValidPosition)
            return false;
    return true;
}

private void updateFetchPositions(Set<TopicPartition> partitions) {
    coordinator.refreshCommittedOffsetsIfNeeded();
    fetcher.updateFetchPositions(partitions);
}
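fetcher.updateFetchPositions then decides, per partition, whether to seek to the committed offset or to reset according to the auto.offset.reset strategy. Roughly, as a simplified sketch of the 0.9 logic (not the verbatim source):

public void updateFetchPositions(Set<TopicPartition> partitions) {
    for (TopicPartition tp : partitions) {
        if (!subscriptions.isAssigned(tp) || subscriptions.isFetchable(tp))
            continue; // already has a valid position
        if (subscriptions.isOffsetResetNeeded(tp)) {
            // an explicit reset (e.g. seekToBeginning) was requested:
            // send a ListOffsetRequest to the partition leader
            resetOffset(tp);
        } else if (subscriptions.committed(tp) == null) {
            // no committed offset: fall back to the auto.offset.reset strategy
            subscriptions.needOffsetReset(tp);
            resetOffset(tp);
        } else {
            // use the committed offset fetched from the coordinator above
            subscriptions.seek(tp, subscriptions.committed(tp).offset());
        }
    }
}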
// ConsumerCoordinator
public void refreshCommittedOffsetsIfNeeded() {
    if (subscriptions.refreshCommitsNeeded()) {
        Map<TopicPartition, OffsetAndMetadata> offsets = fetchCommittedOffsets(subscriptions.assignedPartitions()); // the key function
        for (Map.Entry<TopicPartition, OffsetAndMetadata> entry : offsets.entrySet()) {
            TopicPartition tp = entry.getKey();
            // verify assignment is still active
            if (subscriptions.isAssigned(tp))
                this.subscriptions.committed(tp, entry.getValue());
        }
        this.subscriptions.commitsRefreshed();
    }
}

public Map<TopicPartition, OffsetAndMetadata> fetchCommittedOffsets(Set<TopicPartition> partitions) {
    while (true) {
        ensureCoordinatorKnown();

        // contact coordinator to fetch committed offsets
        RequestFuture<Map<TopicPartition, OffsetAndMetadata>> future = sendOffsetFetchRequest(partitions);
        client.poll(future); // block until the future completes

        if (future.succeeded())
            return future.value();
        if (!future.isRetriable())
            throw future.exception();
        time.sleep(retryBackoffMs);
    }
}
As you can see, the key point is that an OffsetFetchRequest is sent, and it is a synchronous call: the consumer blocks until it has obtained the initial offsets, and only then starts the subsequent poll.
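The committed offset of a partition can also be inspected directly from client code; a minimal sketch (topic and partition made up):

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

TopicPartition tp = new TopicPartition("demo-topic", 0); // hypothetical
// committed() goes through the same fetchCommittedOffsets path shown above
OffsetAndMetadata committed = consumer.committed(tp);
if (committed != null)
    System.out.println("committed offset = " + committed.offset());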
Through the steps above, every TopicPartition of the consumer now has an initial offset, and the consumer can fetch messages in a continuous loop. This is the Fetch process.
In step 1, the core task is building the FetchRequests:
Suppose a consumer subscribes to 3 topics t0, t1, t2 and is assigned the following partitions:
t0: p0, p1
t1: p1
t2: p2
That is 4 TopicPartitions in total: t0p0, t0p1, t1p1, t2p2. These 4 TopicPartitions may be spread across 2 machines n0 and n1:
n0: t0p0, t1p1
n1: t0p1, t2p2
Then one FetchRequest is generated per machine, i.e. a Map<Node, FetchRequest>:
public void initFetches(Cluster cluster) {
    for (Map.Entry<Node, FetchRequest> fetchEntry : createFetchRequests(cluster).entrySet()) {
        final FetchRequest fetch = fetchEntry.getValue();
        client.send(fetchEntry.getKey(), ApiKeys.FETCH, fetch)
                .addListener(new RequestFutureListener<ClientResponse>() {
                    @Override
                    public void onSuccess(ClientResponse response) {
                        handleFetchResponse(response, fetch);
                    }

                    @Override
                    public void onFailure(RuntimeException e) {
                        log.debug("Fetch failed", e);
                    }
                });
    }
}
// key function: group all TopicPartitions that belong to the same Node together,
// generating one FetchRequest per Node
private Map<Node, FetchRequest> createFetchRequests(Cluster cluster) {
    // create the fetch info
    Map<Node, Map<TopicPartition, FetchRequest.PartitionData>> fetchable = new HashMap<>();
    for (TopicPartition partition : subscriptions.fetchablePartitions()) {
        Node node = cluster.leaderFor(partition);
        if (node == null) {
            metadata.requestUpdate();
        } else if (this.client.pendingRequestCount(node) == 0) {
            // if there is a leader and no in-flight requests, issue a new fetch
            Map<TopicPartition, FetchRequest.PartitionData> fetch = fetchable.get(node);
            if (fetch == null) {
                fetch = new HashMap<>();
                fetchable.put(node, fetch);
            }
            long position = this.subscriptions.position(partition);
            fetch.put(partition, new FetchRequest.PartitionData(position, this.fetchSize));
            log.trace("Added fetch request for partition {} at offset {}", partition, position);
        }
    }

    // create the fetches
    Map<Node, FetchRequest> requests = new HashMap<>();
    for (Map.Entry<Node, Map<TopicPartition, FetchRequest.PartitionData>> entry : fetchable.entrySet()) {
        Node node = entry.getKey();
        FetchRequest fetch = new FetchRequest(this.maxWaitMs, this.minBytes, entry.getValue());
        requests.put(node, fetch);
    }
    return requests;
}
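Note the this.client.pendingRequestCount(node) == 0 guard: the consumer keeps at most one in-flight fetch per broker, so a new FetchRequest for a node is only issued after the previous response from that node has been handled. Partitions whose leader is unknown merely trigger a metadata update and are skipped in this round.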
Note that already in step 1, when the FetchRequest was built, its callback handled the returned Response: the results are appended to the Fetcher's member variable records:
public class Fetcher<K, V> {
    ...
    // one ConsumerRecord list + one fetchOffset per TopicPartition
    private final List<PartitionRecords<K, V>> records;
    ...
}

private static class PartitionRecords<K, V> {
    public long fetchOffset;
    public TopicPartition partition;
    public List<ConsumerRecord<K, V>> records;

    public PartitionRecords(long fetchOffset, TopicPartition partition, List<ConsumerRecord<K, V>> records) {
        this.fetchOffset = fetchOffset;
        this.partition = partition;
        this.records = records;
    }
}
fetchedRecords then mainly drains and returns this records variable, while advancing each partition's consumed position to nextOffset.
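Roughly, fetchedRecords looks like the following simplified sketch (the error handling and pause/resume guards of the real code are omitted):

public Map<TopicPartition, List<ConsumerRecord<K, V>>> fetchedRecords() {
    Map<TopicPartition, List<ConsumerRecord<K, V>>> drained = new HashMap<>();
    for (PartitionRecords<K, V> part : this.records) {
        TopicPartition tp = part.partition;
        if (!subscriptions.isFetchable(tp))
            continue; // e.g. the partition was paused or reassigned meanwhile
        // only hand out records whose fetchOffset still matches the current
        // position; anything else belongs to an outdated fetch and is dropped
        long position = subscriptions.position(tp);
        if (part.fetchOffset == position) {
            List<ConsumerRecord<K, V>> records = drained.get(tp);
            if (records == null)
                drained.put(tp, part.records);
            else
                records.addAll(part.records);
            // advance the consumed position past the last returned record
            long nextOffset = part.records.get(part.records.size() - 1).offset() + 1;
            subscriptions.position(tp, nextOffset);
        }
    }
    this.records.clear();
    return drained;
}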
As mentioned earlier, after consuming messages the client needs to acknowledge them to the server by committing offsets. There are 2 strategies for this:
Strategy 1: commit manually, by calling one of 2 KafkaConsumer functions:

public void commitSync();                               // synchronous commit
public void commitAsync(OffsetCommitCallback callback); // asynchronous commit
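Continuing the usage sketch from the beginning of this post, a typical manual-commit loop might look like this (topic name and process() are hypothetical; commitSync() throws if the commit ultimately fails):

consumer.subscribe(Collections.singletonList("demo-topic")); // hypothetical topic
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(1000);
    for (ConsumerRecord<String, String> record : records)
        process(record); // hypothetical processing function
    consumer.commitSync(); // acknowledge only after processing succeeded
}

Internally, both functions delegate to the ConsumerCoordinator: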
public void commitOffsetsSync(Map<TopicPartition, OffsetAndMetadata> offsets) {
    if (offsets.isEmpty())
        return;

    while (true) {
        ensureCoordinatorKnown();
        RequestFuture<Void> future = sendOffsetCommitRequest(offsets); // send the OffsetCommitRequest
        client.poll(future); // synchronous call

        if (future.succeeded())
            return;
        if (!future.isRetriable())
            throw future.exception();
        time.sleep(retryBackoffMs);
    }
}
public void commitOffsetsAsync(final Map<TopicPartition, OffsetAndMetadata> offsets, OffsetCommitCallback callback) {
    this.subscriptions.needRefreshCommits();
    RequestFuture<Void> future = sendOffsetCommitRequest(offsets);
    final OffsetCommitCallback cb = callback == null ? defaultOffsetCommitCallback : callback;
    // asynchronous: the result is handled in a listener instead of blocking
    future.addListener(new RequestFutureListener<Void>() {
        @Override
        public void onSuccess(Void value) {
            cb.onComplete(offsets, null);
        }

        @Override
        public void onFailure(RuntimeException e) {
            cb.onComplete(offsets, e);
        }
    });
}
Strategy 2: auto commit. The client does not acknowledge explicitly; instead, the client internals periodically send commit messages, similar to the HeartBeat. The implementation mechanism is exactly the DelayedQueue + DelayedTask described earlier.
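Auto commit is enabled purely through configuration; a minimal sketch (values for illustration):

Properties props = new Properties();
// ... bootstrap.servers, group.id, deserializers as before ...
props.put("enable.auto.commit", "true");      // activates the AutoCommitTask below
props.put("auto.commit.interval.ms", "5000"); // commit every 5 seconds

These are exactly the ENABLE_AUTO_COMMIT_CONFIG and AUTO_COMMIT_INTERVAL_MS_CONFIG values passed to the ConsumerCoordinator in the constructor at the top of this post. The task itself: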
private class AutoCommitTask implements DelayedTask {
    private final long interval;
    private boolean enabled = false;
    private boolean requestInFlight = false;

    public AutoCommitTask(long interval) {
        this.interval = interval;
    }

    public void enable() {
        if (!enabled) {
            client.unschedule(this);
            this.enabled = true;
            if (!requestInFlight) {
                long now = time.milliseconds();
                client.schedule(this, interval + now);
            }
        }
    }

    public void disable() {
        this.enabled = false;
        client.unschedule(this);
    }

    private void reschedule(long at) {
        if (enabled)
            client.schedule(this, at);
    }

    public void run(final long now) {
        if (!enabled)
            return;

        if (coordinatorUnknown()) {
            log.debug("Cannot auto-commit offsets now since the coordinator is unknown, will retry after backoff");
            client.schedule(this, now + retryBackoffMs);
            return;
        }

        requestInFlight = true;
        commitOffsetsAsync(subscriptions.allConsumed(), new OffsetCommitCallback() {
            @Override
            public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
                requestInFlight = false;
                if (exception == null) {
                    reschedule(now + interval); // in the callback, schedule the next AutoCommitTask run
                } else if (exception instanceof SendFailedException) {
                    log.debug("Failed to send automatic offset commit, will retry immediately");
                    reschedule(now);
                } else {
                    log.warn("Auto offset commit failed: {}", exception.getMessage());
                    reschedule(now + interval);
                }
            }
        });
    }
}
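Note the self-rescheduling pattern here: the next run is only scheduled from inside the completion callback (guarded by the requestInFlight flag), so at most one auto-commit is ever in flight, and a slow commit naturally delays the next one instead of letting requests pile up.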