Spring-Kafka Source Code Analysis

Table of Contents

  • Spring-Kafka
    • kafkaConsumer
      • The kafkaConsumer consumer model
      • spring-kafka consumer implementation
      • Consumer Configs
    • kafkaProducer
      • The kafkaProducer producer model
      • Producer Configs
    • Pitfalls encountered in practice
      • Pitfall 1
      • Pitfall 2

Spring-Kafka

kafkaConsumer

The kafkaConsumer consumer model

[Figure 1: the spring-kafka consumer model]

Main flow of the spring-kafka consumer model:
0. Container startup

  1. ListenerConsumer: run() polls and consumes in a loop
    // the run() method of ListenerConsumer
  2. The kafkaConsumer fetch flow:
  • The Fetcher builds fetch requests and stores each Request in `unsent` by calling ConsumerNetworkClient's send method
  • ConsumerNetworkClient's poll() calls NetworkClient's send() to take each ClientRequest out of `unsent`, convert it into a RequestSend, and stage it into the Selector's KafkaChannel; the Selector's I/O then pulls the ConsumerRecords to be consumed from the kafka cluster
  • The consumer listener MessageListener.onMessage() runs the user-defined business logic that actually consumes the records (a minimal sketch of the whole loop follows this list)
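For intuition, here is a minimal sketch of what that loop boils down to once the Spring machinery is stripped away; the broker address, group id and topic are hypothetical placeholders, and manual commit is used to mirror the MANUAL_IMMEDIATE setup described later:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "demo-group");              // hypothetical group
        props.put("enable.auto.commit", "false");         // manual commits, as with MANUAL_IMMEDIATE
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic")); // hypothetical topic
            while (true) {
                // the Fetcher/ConsumerNetworkClient/Selector work described above happens inside poll()
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value()); // the onMessage() step
                }
                if (!records.isEmpty()) {
                    consumer.commitSync(); // what the container's ack handling ultimately calls
                }
            }
        }
    }
}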

spring-kafka consumer implementation

  1. Implementation approaches
  • Option 1: annotation-based (@KafkaListener)
  • Option 2: XML-based
  • Analysis: both approaches ultimately produce Spring beans
  2. Case study (the XML we use in production; a reconstructed sketch is shown below)
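A minimal sketch of that XML, reconstructed from the beans discussed in ①-⑤ below; the broker address, topic, group id, listener package and concurrency value are illustrative assumptions, and the ContainerProperties package varies across spring-kafka versions:

    <!-- ① consumer parameters (values are assumptions) -->
    <bean id="consumerProperties" class="java.util.HashMap">
        <constructor-arg>
            <map>
                <entry key="bootstrap.servers" value="localhost:9092"/>
                <entry key="group.id" value="demo-group"/>
                <entry key="enable.auto.commit" value="false"/>
                <entry key="session.timeout.ms" value="10000"/>
                <entry key="key.deserializer" value="org.apache.kafka.common.serialization.StringDeserializer"/>
                <entry key="value.deserializer" value="org.apache.kafka.common.serialization.ByteArrayDeserializer"/>
            </map>
        </constructor-arg>
    </bean>

    <!-- ② consumer factory -->
    <bean id="consumerFactory" class="org.springframework.kafka.core.DefaultKafkaConsumerFactory">
        <constructor-arg ref="consumerProperties"/>
    </bean>

    <!-- ③ our custom listener (package assumed) -->
    <bean id="messageListener" class="com.example.KafkaMsgDistributor"/>

    <!-- ④ container properties: topic, listener, ack mode -->
    <bean id="containerProperties" class="org.springframework.kafka.listener.ContainerProperties">
        <constructor-arg value="demo-topic"/>
        <property name="messageListener" ref="messageListener"/>
        <property name="ackMode" value="MANUAL_IMMEDIATE"/>
    </bean>

    <!-- ⑤ the listener container itself -->
    <bean id="messageListenerContainer" class="org.springframework.kafka.listener.ConcurrentMessageListenerContainer">
        <constructor-arg ref="consumerFactory"/>
        <constructor-arg ref="containerProperties"/>
        <property name="concurrency" value="3"/>
    </bean>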

  • ① consumerProperties bean: the consumer's configuration parameters (a map of consumer settings)
  • ② consumerFactory bean: DefaultKafkaConsumerFactory
public class DefaultKafkaConsumerFactory<K, V> implements ConsumerFactory<K, V> {
	// Key method 1: create the consumer
	protected KafkaConsumer<K, V> createKafkaConsumer(Map<String, Object> configs) {
		return new KafkaConsumer<>(configs, this.keyDeserializer, this.valueDeserializer);
	}

	// Key method 2: determine whether auto-commit is enabled (true when the config is absent)
	public boolean isAutoCommit() {
		Object auto = this.configs.get(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG);
		return auto instanceof Boolean ? (Boolean) auto
				: auto instanceof String ? Boolean.valueOf((String) auto) : true;
	}
}
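As a usage sketch (the config values are illustrative assumptions), the factory is typically built from the same properties map the XML wires in:

Map<String, Object> configs = new HashMap<>();
configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
configs.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // assumed group
configs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);           // required for the manual ack modes

DefaultKafkaConsumerFactory<String, byte[]> factory = new DefaultKafkaConsumerFactory<>(
        configs, new StringDeserializer(), new ByteArrayDeserializer());
// isAutoCommit() now returns false, so the container manages commits itself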
  • ③ Implement a custom consumer listener ->
    MessageListener class diagram

The underlying implementation of MessageListener
     MessageListener.onMessage()
     // the final MessageListener (the class that performs consumption) is a field of ListenerConsumer,
     // assigned from the listener specified in containerProperties

     // executes onMessage(List<ConsumerRecord<K, V>> data)...
     // generic types: values are byte[] (deserialized via readFromByteArray below); the String key type is an assumption
     public class KafkaMsgDistributor implements BatchAcknowledgingMessageListener<String, byte[]> {
         public void onMessage(List<ConsumerRecord<String, byte[]>> data, Acknowledgment acknowledgment) {
             try {
                 for (ConsumerRecord<String, byte[]> record : data) {
                     IKafkaTopicMsgHandler handler = CustomTopicManager.getHandler(record.topic());
                     if (handler != null) {
                         handler.doHandle(record.topic(), SerializerUtil.readFromByteArray(record.value()));
                     }
                 }
                 try {
                     // MANUAL_IMMEDIATE: each acknowledge() commits the batch's offsets right away
                     acknowledgment.acknowledge();
                 } catch (CommitFailedException e) {
                     logger.error("commit failed", e);
                 }
             } catch (Exception e) {
                 logger.error("Distribute kafka msg failed:" + e.getMessage(), e);
                 throw new RuntimeException(e);
             }
         }
     }
     
  • ④ Listener container configuration
	    public class ContainerProperties {

		    // setPollTimeout(long), ms
		    public static final long DEFAULT_POLL_TIMEOUT = 5_000L;
		    // setShutdownTimeout(long), ms
		    public static final long DEFAULT_SHUTDOWN_TIMEOUT = 10_000L;
		    // setMonitorInterval(int), s
		    public static final int DEFAULT_MONITOR_INTERVAL = 30;

		    // what to listen to
		    private final String[] topics;
		    private final Pattern topicPattern;
		    // topicPartition + initialOffset + position...
		    private final TopicPartitionInitialOffset[] topicPartitions;
		    // BATCH is the default
		    private AckMode ackMode = AckMode.BATCH;
		    // commits are synchronous by default
		    private boolean syncCommits = true;
		    private String clientId = "";
		    // TIME mode: commit once ackTime has elapsed
		    private long ackTime;
		    // COUNT mode: commit once ackCount acks have accumulated
		    private int ackCount;
		    // the message listener
		    private Object messageListener;
		    // the executor for the threads that poll the consumer
		    private AsyncListenableTaskExecutor consumerTaskExecutor;
		    // the user-defined consumer rebalance listener
		    private ConsumerRebalanceListener consumerRebalanceListener;
		    // timeout for shutting the container down
		    private long shutdownTimeout = DEFAULT_SHUTDOWN_TIMEOUT;
		    // commit callback; logs by default
		    private OffsetCommitCallback commitCallback;
		    //...

		    // Constructors:
		    public ContainerProperties(String... topics) {
		        Assert.notEmpty(topics, "An array of topics must be provided");
		        this.topics = Arrays.asList(topics).toArray(new String[topics.length]);
		        this.topicPattern = null;
		        this.topicPartitions = null;
		    }
		    public ContainerProperties(Pattern topicPattern) {
		        this.topics = null;
		        this.topicPattern = topicPattern;
		        this.topicPartitions = null;
		    }
		    public ContainerProperties(TopicPartitionInitialOffset... topicPartitions) {
		        //...
		    }

		    // The offset commit behavior enumeration
		    public enum AckMode {
		        // commit after each record is processed
		        RECORD,
		        // commit once per poll; the frequency follows the poll frequency (the default)
		        BATCH,
		        // setAckTime(long): commit every ackTime ms; much like auto commit's auto.commit.interval.ms
		        TIME,
		        // setAckCount(int): commit once ackCount acks have accumulated
		        COUNT,
		        // combination of TIME and COUNT: commit as soon as either condition is met
		        COUNT_TIME,
		        // the user takes responsibility for acks via an AcknowledgingMessageListener; acks are still committed in batches behind the scenes
		        MANUAL,
		        // the user takes responsibility for acks via an AcknowledgingMessageListener; every acknowledge() is committed immediately
		        // (this is the mode we use)
		        MANUAL_IMMEDIATE
		    }

		    // getters and setters...
		}
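A minimal Java wiring sketch of the MANUAL_IMMEDIATE setup used here (the topic name is an illustrative assumption):

ContainerProperties containerProps = new ContainerProperties("demo-topic"); // assumed topic
containerProps.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);    // listener acks; each one committed immediately
containerProps.setMessageListener(new KafkaMsgDistributor());               // the batch listener shown above
containerProps.setSyncCommits(true);                                        // each acknowledge() ends in commitSync()

In older spring-kafka versions the AckMode enum lives in AbstractMessageListenerContainer rather than ContainerProperties.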
	    
  • ⑤ MessageListenerContainer ->
    MessageListenerContainer class diagram

(1) ConcurrentMessageListenerContainer

public class ConcurrentMessageListenerContainer<K, V> extends AbstractMessageListenerContainer<K, V> {
    private final List<KafkaMessageListenerContainer<K, V>> containers = new ArrayList<>();
    private int concurrency = 1;
    
        protected void doStart() {
        if (!this.isRunning()) {
            this.checkTopics();
            ContainerProperties containerProperties = this.getContainerProperties();
            TopicPartitionInitialOffset[] topicPartitions = containerProperties.getTopicPartitions();
            if (topicPartitions != null && this.concurrency > topicPartitions.length) {
                this.logger.warn("When specific partitions are provided, the concurrency must be less than or equal to the number of partitions; reduced from " + this.concurrency + " to " + topicPartitions.length);
                this.concurrency = topicPartitions.length;
            }

            this.setRunning(true);

            // one KafkaMessageListenerContainer is created per unit of concurrency
            for(int i = 0; i < this.concurrency; ++i) {
                KafkaMessageListenerContainer container;
                if (topicPartitions == null) {
                    container = new KafkaMessageListenerContainer(this, this.consumerFactory, containerProperties);
                } else {
                    container = new KafkaMessageListenerContainer(this, this.consumerFactory, containerProperties, this.partitionSubset(containerProperties, i));
                }

                String beanName = this.getBeanName();
                container.setBeanName((beanName != null ? beanName : "consumer") + "-" + i);
                if (this.getApplicationEventPublisher() != null) {
                    container.setApplicationEventPublisher(this.getApplicationEventPublisher());
                }

                container.setClientIdSuffix("-" + i);
                container.setGenericErrorHandler(this.getGenericErrorHandler());
                container.setAfterRollbackProcessor(this.getAfterRollbackProcessor());
                // start() will invoke KafkaMessageListenerContainer's doStart()
                container.start();
                this.containers.add(container);
            }
        }

    }
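A quick driving sketch (the concurrency value is illustrative): concurrency simply controls how many child containers, and therefore KafkaConsumer instances, doStart() creates:

ConcurrentMessageListenerContainer<String, byte[]> container =
        new ConcurrentMessageListenerContainer<>(consumerFactory, containerProps);
container.setConcurrency(3); // doStart() creates 3 KafkaMessageListenerContainers,
                             // capped at the partition count when partitions are given explicitly
container.start();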

(2) KafkaMessageListenerContainer

ListenerConsumer -> ListenerConsumer class diagram

public class KafkaMessageListenerContainer<K, V> extends AbstractMessageListenerContainer<K, V> {
    private final AbstractMessageListenerContainer container;
    private final TopicPartitionInitialOffset[] topicPartitions;
    private volatile KafkaMessageListenerContainer.ListenerConsumer listenerConsumer;
    private volatile ListenableFuture listenerConsumerFuture;
    private GenericMessageListener listener;
    private String clientIdSuffix;
    
    // constructor
    public KafkaMessageListenerContainer(ConsumerFactory consumerFactory,
			ContainerProperties containerProperties) {
		this(null, consumerFactory, containerProperties, (TopicPartitionInitialOffset[]) null);
	}
	
        // core method
    	protected void doStart() {
		if (isRunning()) {
			return;
		}
		if (this.clientIdSuffix == null) { // stand-alone container
			checkTopics();
		}
		ContainerProperties containerProperties = getContainerProperties();
		if (!this.consumerFactory.isAutoCommit()) {
			AckMode ackMode = containerProperties.getAckMode();
			if (ackMode.equals(AckMode.COUNT) || ackMode.equals(AckMode.COUNT_TIME)) {
				Assert.state(containerProperties.getAckCount() > 0, "'ackCount' must be > 0");
			}
			if ((ackMode.equals(AckMode.TIME) || ackMode.equals(AckMode.COUNT_TIME))
					&& containerProperties.getAckTime() == 0) {
				containerProperties.setAckTime(5000);
			}
		}

        // fetch the messageListener from the configuration
		Object messageListener = containerProperties.getMessageListener();
		Assert.state(messageListener != null, "A MessageListener is required");
		if (containerProperties.getConsumerTaskExecutor() == null) {
			SimpleAsyncTaskExecutor consumerExecutor = new SimpleAsyncTaskExecutor(
					(getBeanName() == null ? "" : getBeanName()) + "-C-");
			containerProperties.setConsumerTaskExecutor(consumerExecutor);
		}
		Assert.state(messageListener instanceof GenericMessageListener, "Listener must be a GenericListener");
		this.listener = (GenericMessageListener) messageListener;

		ListenerType listenerType = ListenerUtils.determineListenerType(this.listener);
		if (this.listener instanceof DelegatingMessageListener) {
			Object delegating = this.listener;
			while (delegating instanceof DelegatingMessageListener) {
				delegating = ((DelegatingMessageListener) delegating).getDelegate();
			}
			listenerType = ListenerUtils.determineListenerType(delegating);
		}
		this.listenerConsumer = new ListenerConsumer(this.listener, listenerType);
		setRunning(true);
		
		// async task: the ListenerConsumer's run() method executes on another thread
		this.listenerConsumerFuture = containerProperties
				.getConsumerTaskExecutor()// returns the AsyncListenableTaskExecutor
				.submitListenable(this.listenerConsumer);// the listenerConsumer then loops through run() on that executor
	}
	
	
    
    // inner class
    private final class ListenerConsumer implements SchedulingAwareRunnable, ConsumerSeekCallback {
        private final ContainerProperties containerProperties = KafkaMessageListenerContainer.this.getContainerProperties();
        private final OffsetCommitCallback commitCallback;
        private final Consumer consumer;
        private final Map<String, Map<Integer, Long>> offsets; // topic -> (partition -> offset)
        private final GenericMessageListener genericListener;
        private final MessageListener listener;
        // the batch-consuming listener (our KafkaMsgDistributor...)
        private final BatchMessageListener batchListener;
        private final ListenerType listenerType;
        private final boolean isConsumerAwareListener;
        private final boolean isBatchListener;
        private final boolean wantsFullRecords;
        private final boolean autoCommit;
        private final boolean isManualAck;
        // the AckMode we configured (MANUAL_IMMEDIATE)
        private final boolean isManualImmediateAck;
        private final boolean isAnyManualAck;
        private final boolean isRecordAck;
        private final boolean isBatchAck;
        private final BlockingQueue<ConsumerRecord<K, V>> acks;
        private final BlockingQueue<TopicPartitionInitialOffset> seeks;
        private final ErrorHandler errorHandler;
        private final BatchErrorHandler batchErrorHandler;
        private final PlatformTransactionManager transactionManager;
        private final KafkaAwareTransactionManager kafkaTxManager;
        private final TransactionTemplate transactionTemplate;
        private final String consumerGroupId;
        private final TaskScheduler taskScheduler;
        private final ScheduledFuture monitorTask;
        private final LogIfLevelEnabled commitLogger;
        private final Duration pollTimeout;
        private volatile Map definedPartitions;
        private volatile Collection assignedPartitions;
        private volatile Thread consumerThread;
        private int count;
        private long last;
        private boolean fatalError;
        private boolean taskSchedulerExplicitlySet;
        private boolean consumerPaused;
        private volatile long lastPoll;
        // rebalance listener; createRebalanceListener() is shown below
        ConsumerRebalanceListener rebalanceListener = this.createRebalanceListener(consumer);
        
        // constructor
        ListenerConsumer(GenericMessageListener listener, ListenerType listenerType){
            //... init some field
        
        // case 1: no specific topic partitions were assigned
        if (KafkaMessageListenerContainer.this.topicPartitions == null) {
                if (this.containerProperties.getTopicPattern() != null) {
                   // a topic pattern was set: subscribe to the matching topics
                   
                   consumer.subscribe(this.containerProperties.getTopicPattern(), rebalanceListener);
                } else {
                   // subscribe to the named topics
                   
                   consumer.subscribe(Arrays.asList(this.containerProperties.getTopics()), rebalanceListener);
                }
                // case 2: specific topic partitions were assigned
            } else {
                List topicPartitions = Arrays.asList(KafkaMessageListenerContainer.this.topicPartitions);
                this.definedPartitions = new HashMap(topicPartitions.size());
                Iterator var7 = topicPartitions.iterator();

                while(var7.hasNext()) {
                    TopicPartitionInitialOffset topicPartition = (TopicPartitionInitialOffset)var7.next();
                   
                   // topicPartition + OffsetMetadata
                   
                   this.definedPartitions.put(topicPartition.topicPartition(), new KafkaMessageListenerContainer.OffsetMetadata(topicPartition.initialOffset(), topicPartition.isRelativeToCurrent(), topicPartition.getPosition()));
                }
            
                consumer.assign(new ArrayList(this.definedPartitions.keySet()));
            }
        }
        
         public ConsumerRebalanceListener createRebalanceListener(final Consumer consumer) {
               return new ConsumerRebalanceListener() {
                final ConsumerRebalanceListener userListener = KafkaMessageListenerContainer.this.getContainerProperties().getConsumerRebalanceListener();
                final ConsumerAwareRebalanceListener consumerAwareListener;

                {
                    this.consumerAwareListener = this.userListener instanceof ConsumerAwareRebalanceListener ? (ConsumerAwareRebalanceListener)this.userListener : null;
                }
                
                public void onPartitionsAssigned(Collection partitions){
                
                if (ListenerConsumer.this.consumerPaused) {
                        ListenerConsumer.this.consumerPaused = false;
                        ListenerConsumer.this.logger.warn("Paused consumer resumed by Kafka due to rebalance; the container will pause again before polling, unless the container's 'paused' property is reset by a custom rebalance listener");
                    }

                    ListenerConsumer.this.assignedPartitions = partitions;
                    
                    if (!ListenerConsumer.this.autoCommit) {
                        final Map offsets = new HashMap();
                        Iterator var3 = partitions.iterator();

                        while(var3.hasNext()) {
                            TopicPartition partition = (TopicPartition)var3.next();

                            try {
                                // record the consumer's current position for this partition
                                offsets.put(partition, new OffsetAndMetadata(consumer.position(partition)));
                            } catch (NoOffsetForPartitionException var6) {
                                ListenerConsumer.this.fatalError = true;
                                ListenerConsumer.this.logger.error("No offset and no reset policy", var6);
                                return;
                            }
                        }

                        ListenerConsumer.this.commitLogger.log(() -> {
                            return "Committing on assignment: " + offsets;
                        });
                        if (ListenerConsumer.this.transactionTemplate != null && ListenerConsumer.this.kafkaTxManager != null) {
                            ListenerConsumer.this.transactionTemplate.execute(new TransactionCallbackWithoutResult() {
                                protected void doInTransactionWithoutResult(TransactionStatus status) {
                                    ((KafkaResourceHolder)TransactionSynchronizationManager.getResource(ListenerConsumer.this.kafkaTxManager.getProducerFactory())).getProducer().sendOffsetsToTransaction(offsets, ListenerConsumer.this.consumerGroupId);
                                }
                            });
                        } else if (KafkaMessageListenerContainer.this.getContainerProperties().isSyncCommits()) {
                           // synchronous commit
                           ListenerConsumer.this.consumer.commitSync(offsets);
                        } else {
                           // asynchronous commit
                           ListenerConsumer.this.consumer.commitAsync(offsets, KafkaMessageListenerContainer.this.getContainerProperties().getCommitCallback());
                        }
                    }
                }
         }
         }
         
public void run() {
    // if running with explicitly defined partitions, rebalance-based assignment is unnecessary
    if (isRunning() && this.definedPartitions != null) {
        // initialize partitions if needed
        initPartitionsIfNeeded();
    }

    if (!this.autoCommit && !this.isRecordAck) {
        // handle manual commits
        processCommits();
    }
    // re-seek offsets to be used by the next poll
    processSeeks();

    // the steps above finish the commit/seek work left over from the previous consumption cycle

    // start the next poll
    // poll() belongs to kafka-clients' consumer; from here on the work is done by the Apache Kafka classes analyzed below
    // ConsumerRecords implements Iterable and is converted into a List further down; K is the user's message key type, V the message value type
    ConsumerRecords<K, V> records = this.consumer.poll(this.pollTimeout);

    if (records != null && records.count() > 0) {
        if (this.containerProperties.getIdleEventInterval() != null) {
            lastReceive = System.currentTimeMillis();
        }
        // dispatch to the listener; consumption runs on the current thread
        invokeListener(records);
    }

    // Clear the group id for the consumer bound to this thread.
    ProducerFactoryUtils.clearConsumerGroupId();
}

        private void processCommits() {
            this.count += this.acks.size();
			handleAcks();
			long now;
			AckMode ackMode = this.containerProperties.getAckMode();
			if (!this.isManualImmediateAck) {
			    //... not the mode we use
			}
        }
        
        /**
		 * Process any acks that have been queued.
		 */
		private void handleAcks() {
			ConsumerRecord<K, V> record = this.acks.poll();
			while (record != null) {
				if (this.logger.isTraceEnabled()) {
					this.logger.trace("Ack: " + record);
				}
				processAck(record);
				record = this.acks.poll();
			}
		}
		
		
		private void processAck(ConsumerRecord<K, V> record) {
			if (!Thread.currentThread().equals(this.consumerThread)) {
				try {
					this.acks.put(record);
				}
				catch (InterruptedException e) {
					Thread.currentThread().interrupt();
					throw new KafkaException("Interrupted while storing ack", e);
				}
			}
			else {
				if (this.isManualImmediateAck) {
					try {
						ackImmediate(record);
					}
					catch (WakeupException e) {
						// ignore - not polling
					}
				}
				else {
					addOffset(record);
				}
			}
		}
		
		// the core method that actually commits an ack
		private void ackImmediate(ConsumerRecord<K, V> record) {
			Map<TopicPartition, OffsetAndMetadata> commits = Collections.singletonMap(
					new TopicPartition(record.topic(), record.partition()),
					new OffsetAndMetadata(record.offset() + 1));
			this.commitLogger.log(() -> "Committing: " + commits);
			// syncCommits is the default
			if (this.containerProperties.isSyncCommits()) {
				this.consumer.commitSync(commits);
			}
			else {
				this.consumer.commitAsync(commits, this.commitCallback);
			}
		}
		
		
		
		// re-seek offsets for the next consumption cycle
		private void processSeeks() {
		    // BlockingQueue<TopicPartitionInitialOffset> seeks
			TopicPartitionInitialOffset offset = this.seeks.poll();
			// null when no seek was requested
			while (offset != null) {
				if (this.logger.isTraceEnabled()) {
					this.logger.trace("Seek: " + offset);
				}
				try {
					SeekPosition position = offset.getPosition();
					if (position == null) {
						this.consumer.seek(offset.topicPartition(), offset.initialOffset());
					}
					else if (position.equals(SeekPosition.BEGINNING)) {
						this.consumer.seekToBeginning(Collections.singletonList(offset.topicPartition()));
					}
					else {
						this.consumer.seekToEnd(Collections.singletonList(offset.topicPartition()));
					}
				}
				catch (Exception e) {
					this.logger.error("Exception while seeking " + offset, e);
				}
				offset = this.seeks.poll();
			}
		}


					
        private void invokeListener(final ConsumerRecords<K, V> records) {
            if (this.isBatchListener) {
                invokeBatchListener(records);
            } else {
                invokeRecordListener(records);
            }
        }
		
		private void invokeBatchListener(final ConsumerRecords<K, V> records) {
			List<ConsumerRecord<K, V>> recordList = null;
			if (!this.wantsFullRecords) {
			    // converts records into a List<ConsumerRecord<K, V>>
				recordList = createRecordList(records);
			}
			if (this.wantsFullRecords || recordList.size() > 0) {
				if (this.transactionTemplate != null) {
					invokeBatchListenerInTx(records, recordList);
				}
				else {
					doInvokeBatchListener(records, recordList, null);
				}
			}
		}
		
        // The core method: this invokes the onMessage() of the MessageListener interface we implemented
        private RuntimeException doInvokeBatchListener(ConsumerRecords<K, V> records, List<ConsumerRecord<K, V>> recordList, Producer<K, V> producer) throws Error {
            // dispatch onMessage
            try {
                if (this.wantsFullRecords) {
                    // consumer ---> private final org.apache.kafka.clients.consumer.Consumer consumer;
                    // core dispatch
                    this.batchListener.onMessage(records, this.isAnyManualAck ? new KafkaMessageListenerContainer.ListenerConsumer.ConsumerBatchAcknowledgment(records) : null, this.consumer);
                } else {
                    switch(this.listenerType) {
                    case ACKNOWLEDGING_CONSUMER_AWARE:
                    // core dispatch
                        this.batchListener.onMessage(recordList, this.isAnyManualAck ? new KafkaMessageListenerContainer.ListenerConsumer.ConsumerBatchAcknowledgment(records) : null, this.consumer);
                        break;
                        // core dispatch
                    // with our MANUAL_IMMEDIATE mode set,
                    // this calls GenericMessageListener.onMessage(List<ConsumerRecord<K, V>> data, Acknowledgment acknowledgment)
                    case ACKNOWLEDGING:
                        this.batchListener.onMessage(recordList, this.isAnyManualAck ? new KafkaMessageListenerContainer.ListenerConsumer.ConsumerBatchAcknowledgment(records) : null);
                        break;
                    case CONSUMER_AWARE:
                        this.batchListener.onMessage(recordList, this.consumer);
                        break;
                    case SIMPLE:
                        this.batchListener.onMessage(recordList);
                    }
                }

                if (!this.isAnyManualAck && !this.autoCommit) {
                    Iterator var11 = this.getHighestOffsetRecords(records).iterator();

                    while(var11.hasNext()) {
                        ConsumerRecord record = (ConsumerRecord)var11.next();
                        this.acks.put(record);
                    }

                    if (producer != null) {
                        this.sendOffsetsToTransaction(producer);
                    }
                }
            } catch (RuntimeException var9) {
                RuntimeException e = var9;
                Iterator var5;
                ConsumerRecord recordx;
                if (this.containerProperties.isAckOnError() && !this.autoCommit && producer == null) {
                    var5 = this.getHighestOffsetRecords(records).iterator();

                    while(var5.hasNext()) {
                        recordx = (ConsumerRecord)var5.next();
                        this.acks.add(recordx);
                    }
                }

                if (this.batchErrorHandler == null) {
                    throw var9;
                }

                try {
                    if (this.batchErrorHandler instanceof ContainerAwareBatchErrorHandler) {
                        ((ContainerAwareBatchErrorHandler)this.batchErrorHandler).handle(e, records, this.consumer, KafkaMessageListenerContainer.this.container);
                    } else {
                        this.batchErrorHandler.handle(e, records, this.consumer);
                    }

                    if (producer != null) {
                        var5 = this.getHighestOffsetRecords(records).iterator();

                        while(var5.hasNext()) {
                            recordx = (ConsumerRecord)var5.next();
                            this.acks.add(recordx);
                        }

                        this.sendOffsetsToTransaction(producer);
                    }
                } catch (RuntimeException var7) {
                    this.logger.error("Error handler threw an exception", var7);
                    return var7;
                } catch (Error var8) {
                    this.logger.error("Error handler threw an error", var8);
                    throw var8;
                }
            } catch (InterruptedException var10) {
                Thread.currentThread().interrupt();
            }

            return null;
        }

        

⑥ The KafkaConsumer consumption flow

1. Analysis of org.apache.kafka.clients.consumer.KafkaConsumer

Key fields
    private final String clientId;
    private final ConsumerCoordinator coordinator;
    private final Deserializer keyDeserializer;
    private final Deserializer valueDeserializer;
    private final Fetcher fetcher;
    private final ConsumerInterceptors interceptors;
    private final Time time;
    private final ConsumerNetworkClient client;
    private final SubscriptionState subscriptions;
    private final Metadata metadata;
    // org.apache.kafka.clients.consumer.RangeAssignor (the default) or RoundRobinAssignor;
    // custom partition-assignment implementations are also possible
    private List<PartitionAssignor> assignors;
    
    private KafkaConsumer(ConsumerConfig config, Deserializer keyDeserializer, Deserializer valueDeserializer) {
        //...
        
        // the partition-assignment implementations
        this.assignors = config.getConfiguredInstances("partition.assignment.strategy", PartitionAssignor.class);
        
        this.client = new ConsumerNetworkClient(logContext, netClient, this.metadata, this.time, this.retryBackoffMs, config.getInt("request.timeout.ms"), heartbeatIntervalMs);
        
        this.fetcher = new Fetcher(logContext, this.client, config.getInt("fetch.min.bytes"), config.getInt("fetch.max.bytes"), config.getInt("fetch.max.wait.ms"), config.getInt("max.partition.fetch.bytes"), config.getInt("max.poll.records"), config.getBoolean("check.crcs"), this.keyDeserializer, this.valueDeserializer, this.metadata, this.subscriptions, this.metrics, metricsRegistry.fetcherMetrics, this.time, this.retryBackoffMs, this.requestTimeoutMs, isolationLevel);
    }
    // request and fetch Records
    private ConsumerRecords<K, V> poll(long timeoutMs, boolean includeMetadataInTimeout) {
        //...
        Map<TopicPartition, List<ConsumerRecord<K, V>>> records = this.pollForFetches(this.remainingTimeAtLeastZero(timeoutMs, elapsedTime));

    }
    
    private Map<TopicPartition, List<ConsumerRecord<K, V>>> pollForFetches(long timeoutMs) {
        long startMs = this.time.milliseconds();
        long pollTimeout = Math.min(this.coordinator.timeToNextPoll(startMs), timeoutMs);
        Map<TopicPartition, List<ConsumerRecord<K, V>>> records = this.fetcher.fetchedRecords();
        if (!records.isEmpty()) {
            return records;
        } else {
            // Fetcher
            this.fetcher.sendFetches();
            if (!this.cachedSubscriptionHashAllFetchPositions && pollTimeout > this.retryBackoffMs) {
                pollTimeout = this.retryBackoffMs;
            }

            //ConsumerNetworkClient
            this.client.poll(pollTimeout, startMs, () -> {
                return !this.fetcher.hasCompletedFetches();
            });
            return this.coordinator.rejoinNeededOrPending() ? Collections.emptyMap() : this.fetcher.fetchedRecords();
        }
    }
    
    
    
    
    
 2. Analysis of org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient

Key fields

    // the NetworkClient from the diagram; implements KafkaClient
    private final KafkaClient client;
    // the `unsent` buffer from the diagram
    private final ConsumerNetworkClient.UnsentRequests unsent = new ConsumerNetworkClient.UnsentRequests();
    private final Metadata metadata;
    private final Time time;
    private final long retryBackoffMs;
    private final int maxPollTimeoutMs;
    private final int requestTimeoutMs;
    
Constructor
    public ConsumerNetworkClient(LogContext logContext, KafkaClient client, Metadata metadata, Time time, long retryBackoffMs, int requestTimeoutMs, int maxPollTimeoutMs) {
        
    }

Key methods
    poll(long timeout, long now, ConsumerNetworkClient.PollCondition pollCondition, boolean disableWakeup){
        {
        this.firePendingCompletedRequests();
        this.lock.lock();

        try {
            this.handlePendingDisconnects();
            // calls trySend() ---> client.send
            long pollDelayMs = this.trySend(now);
            timeout = Math.min(timeout, pollDelayMs);
            if (this.pendingCompletion.isEmpty() && (pollCondition == null || pollCondition.shouldBlock())) {
                if (this.client.inFlightRequestCount() == 0) {
                    timeout = Math.min(timeout, this.retryBackoffMs);
                }
                
                this.client.poll(Math.min((long)this.maxPollTimeoutMs, timeout), now);
                now = this.time.milliseconds();
            } else {
                this.client.poll(0L, now);
            }

            this.checkDisconnects(now);
            if (!disableWakeup) {
                this.maybeTriggerWakeup();
            }

            this.maybeThrowInterruptException();
            this.trySend(now);
            this.failExpiredRequests(now);
            this.unsent.clean();
        } finally {
            this.lock.unlock();
        }

        this.firePendingCompletedRequests();
    }
        
    }
    
    private long trySend(long now) {
        long pollDelayMs = 9223372036854775807L;
        Iterator var5 = this.unsent.nodes().iterator();
        
        // from the diagram: take the Requests out of `unsent` one by one
        while(var5.hasNext()) {
            Node node = (Node)var5.next();
            Iterator iterator = this.unsent.requestIterator(node);
            if (iterator.hasNext()) {
                pollDelayMs = Math.min(pollDelayMs, this.client.pollDelayMs(node, now));
            }

            while(iterator.hasNext()) {
                ClientRequest request = (ClientRequest)iterator.next();
                if (this.client.ready(node, now)) {
                    // from the diagram: call NetworkClient.send(), which converts the ClientRequest into a RequestSend
                    this.client.send(request, now);
                    iterator.remove();
                }
            }
        }

        return pollDelayMs;
    }
    
    public RequestFuture send(Node node, Builder requestBuilder, int requestTimeoutMs) {
        long now = this.time.milliseconds();
        ConsumerNetworkClient.RequestFutureCompletionHandler completionHandler = new ConsumerNetworkClient.RequestFutureCompletionHandler();
        ClientRequest clientRequest = this.client.newClientRequest(node.idString(), requestBuilder, now, true, requestTimeoutMs, completionHandler);
        // buffer the clientRequest in `unsent`
        this.unsent.put(node, clientRequest);
        this.client.wakeup();
        return completionHandler.future;
    }
    
    
3. Analysis of org.apache.kafka.clients.consumer.internals.Fetcher
Key fields
    // the ConsumerNetworkClient that the Fetcher delegates to
    private final ConsumerNetworkClient client;
    private final Time time;
    private final int minBytes;
    private final int maxBytes;
    private final int maxWaitMs;
    private final int fetchSize;
    private final long retryBackoffMs;
    private final long requestTimeoutMs;
    private final int maxPollRecords;
    private final Metadata metadata;

Constructor
    public Fetcher(LogContext logContext, ConsumerNetworkClient client, int minBytes, int maxBytes, int maxWaitMs, int fetchSize, int maxPollRecords, boolean checkCrcs, Deserializer keyDeserializer, Deserializer valueDeserializer, Metadata metadata, SubscriptionState subscriptions, Metrics metrics, FetcherMetricsRegistry metricsRegistry, Time time, long retryBackoffMs, long requestTimeoutMs, IsolationLevel isolationLevel) {
        
    }

Key methods

    public Map<TopicPartition, List<ConsumerRecord<K, V>>> fetchedRecords() {
        Map<TopicPartition, List<ConsumerRecord<K, V>>> fetched = new HashMap<>();
        int recordsRemaining = this.maxPollRecords;

        try {
            while(recordsRemaining > 0) {
                if (this.nextInLineRecords != null && !this.nextInLineRecords.isFetched) {
                    List<ConsumerRecord<K, V>> records = this.fetchRecords(this.nextInLineRecords, recordsRemaining);
                    TopicPartition partition = this.nextInLineRecords.partition;
                    if (!records.isEmpty()) {
                        List<ConsumerRecord<K, V>> currentRecords = (List) fetched.get(partition);
                        if (currentRecords == null) {
                            fetched.put(partition, records);
                        } else {
                            List<ConsumerRecord<K, V>> newRecords = new ArrayList<>(records.size() + currentRecords.size());
                            newRecords.addAll(currentRecords);
                            newRecords.addAll(records);
                            fetched.put(partition, newRecords);
                        }

                        recordsRemaining -= records.size();
                    }
                } else {
                    Fetcher.CompletedFetch completedFetch = (Fetcher.CompletedFetch)this.completedFetches.peek();
                    if (completedFetch == null) {
                        break;
                    }

                    try {
                        this.nextInLineRecords = this.parseCompletedFetch(completedFetch);
                    } catch (Exception var7) {
                        PartitionData partition = completedFetch.partitionData;
                        if (fetched.isEmpty() && (partition.records == null || partition.records.sizeInBytes() == 0)) {
                            this.completedFetches.poll();
                        }

                        throw var7;
                    }

                    this.completedFetches.poll();
                }
            }
        } catch (KafkaException var8) {
            if (fetched.isEmpty()) {
                throw var8;
            }
        }

        return fetched;
    }
    

    public int sendFetches() {
        Map<Node, FetchSessionHandler.FetchRequestData> fetchRequestMap = this.prepareFetchRequests();
        // this is the step from the diagram that buffers each fetch request into `unsent`
        for (Map.Entry<Node, FetchSessionHandler.FetchRequestData> entry : fetchRequestMap.entrySet()) {
            final Node fetchTarget = entry.getKey();
            final FetchSessionHandler.FetchRequestData data = entry.getValue();
            final FetchRequest.Builder request = FetchRequest.Builder.forConsumer(this.maxWaitMs, this.minBytes, data.toSend());
            // ConsumerNetworkClient.send() stores the request in `unsent`
            this.client.send(fetchTarget, request).addListener(new RequestFutureListener<ClientResponse>() {
                //... move the response into completedFetches
            });
        }
        return fetchRequestMap.size();
    }
    
    
4. Analysis of org.apache.kafka.clients.NetworkClient

Key fields

    private final Selectable selector;
    private final MetadataUpdater metadataUpdater;
    private final Random randOffset;
    private final ClusterConnectionStates connectionStates;
    private final InFlightRequests inFlightRequests;
    private final int socketSendBuffer;
    private final int socketReceiveBuffer;
    private final String clientId;
    private int correlation;
    private final int defaultRequestTimeoutMs;
    private final long reconnectBackoffMs;
    private final Time time;
    private final boolean discoverBrokerVersions;
    private final ApiVersions apiVersions;
    private final Map nodesNeedingApiVersionsFetch;
    private final List abortedSends;
    private final Sensor throttleTimeSensor;
    

Key methods

    // converts the ClientRequest into a RequestSend; a doSend overload resolves the AbstractRequest, then calls the four-argument doSend below
    public void send(ClientRequest request, long now) {
        this.doSend(request, false, now);
    }
    
    //doSend
    private void doSend(ClientRequest clientRequest, boolean isInternalRequest, long now, AbstractRequest request) {
        String destination = clientRequest.destination();
        RequestHeader header = clientRequest.makeHeader(request.version());
        if (this.log.isDebugEnabled()) {
            int latestClientVersion = clientRequest.apiKey().latestVersion();
            if (header.apiVersion() == latestClientVersion) {
                this.log.trace("Sending {} {} with correlation id {} to node {}", new Object[]{clientRequest.apiKey(), request, clientRequest.correlationId(), destination});
            } else {
                this.log.debug("Using older server API v{} to send {} {} with correlation id {} to node {}", new Object[]{header.apiVersion(), clientRequest.apiKey(), request, clientRequest.correlationId(), destination});
            }
        }

        Send send = request.toSend(destination, header);
        NetworkClient.InFlightRequest inFlightRequest = new NetworkClient.InFlightRequest(clientRequest, header, isInternalRequest, request, send, now);
        this.inFlightRequests.add(inFlightRequest);
        // hand the Send to the Selector
        this.selector.send(send);
    }
    

5. The org.apache.kafka.common.network.Selector class

Key methods

public void send(Send send) {
        String connectionId = send.destination();
        // look up the KafkaChannel
        KafkaChannel channel = this.openOrClosingChannelOrFail(connectionId);
        if (this.closingChannels.containsKey(connectionId)) {
            // channel is closing: record as a failed send
            this.failedSends.add(connectionId);
        } else {
            try {
                channel.setSend(send);
            } catch (Exception var5) {
                channel.state(ChannelState.FAILED_SEND);
                this.failedSends.add(connectionId);
                this.close(channel, Selector.CloseMode.DISCARD_NO_NOTIFY);
                if (!(var5 instanceof CancelledKeyException)) {
                    this.log.error("Unexpected exception during send, closing connection {} and rethrowing exception {}", connectionId, var5);
                    throw var5;
                }
            }
        }

    }    
    
 
6. org.apache.kafka.common.network.KafkaChannel

    public class KafkaChannel {
        private final TransportLayer transportLayer; // the transport layer
    }

    public interface TransportLayer extends ScatteringByteChannel, GatheringByteChannel {
        // transferFrom, like NIO's transferTo, ultimately goes through Linux sendfile(), so the data transfer is zero-copy
        long transferFrom(FileChannel fileChannel, long position, long count) throws IOException;
    }
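To make the zero-copy point concrete, here is a small self-contained sketch (file name, host and port are made up) of FileChannel.transferTo(), the NIO call whose sendfile()-backed behavior the comment above refers to:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySketch {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = FileChannel.open(Paths.get("segment.log"), StandardOpenOption.READ); // assumed file
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9092))) {  // assumed endpoint
            long position = 0;
            long remaining = file.size();
            while (remaining > 0) {
                // transferTo moves bytes kernel-to-kernel (sendfile) without copying through user space
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}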
    
    
    
  3. KafkaMessageListenerContainer vs. ConcurrentMessageListenerContainer
  • Difference in the Spring configuration (a reconstructed sketch follows)
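A sketch of the two bean definitions (bean refs and the concurrency value are illustrative assumptions):

    <!-- single-threaded: one KafkaConsumer -->
    <bean id="messageListenerContainer" class="org.springframework.kafka.listener.KafkaMessageListenerContainer">
        <constructor-arg ref="consumerFactory"/>
        <constructor-arg ref="containerProperties"/>
    </bean>

    <!-- concurrent: one child KafkaMessageListenerContainer (and KafkaConsumer) per unit of concurrency -->
    <bean id="messageListenerContainer" class="org.springframework.kafka.listener.ConcurrentMessageListenerContainer">
        <constructor-arg ref="consumerFactory"/>
        <constructor-arg ref="containerProperties"/>
        <property name="concurrency" value="3"/>
    </bean>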

Consumer Configs

NAME | DESCRIPTION | TYPE | DEFAULT
bootstrap.servers | list of host/port pairs used to establish the initial connection to the Kafka cluster | list |
key.deserializer | deserializer class for keys, implementing the org.apache.kafka.common.serialization.Deserializer interface | class |
value.deserializer | deserializer class for values, implementing the org.apache.kafka.common.serialization.Deserializer interface | class |
fetch.min.bytes | minimum amount of data the server returns for a fetch request; the default of 1 lets consumer requests return as quickly as possible | int | 1
group.id | unique string identifying the consumer group this consumer belongs to | string | ""
heartbeat.interval.ms | heartbeat interval | int | 3000
max.partition.fetch.bytes | maximum size of records fetched from a single partition per fetch request | int | 1048576
session.timeout.ms | timeout used to detect consumer failures when using Kafka's group management facility | int | 10000
max.poll.interval.ms | maximum interval between two polls | int | 300000 (when set manually, it must be larger than session.timeout.ms above)
ssl.key.password | password of the private key in the key store file | password | null
ssl.keystore.location | location of the key store file | string | null
ssl.keystore.password | store password for the key store file | password | null
ssl.truststore.location | location of the trust store file | string | null
ssl.truststore.password | password for the trust store file | password | null
auto.offset.reset | tells the Kafka broker what to do when there is no initial offset, or the current offset no longer exists (earliest, latest, none, anything else) | string | latest
connections.max.idle.ms | idle connection timeout | long | 540000
enable.auto.commit | automatic vs. manual offset commits | boolean | true (auto commit by default)
exclude.internal.topics | whether records from internal topics (such as __consumer_offsets) should be exposed to the consumer | boolean | true
fetch.max.bytes | maximum size of records fetched from a single broker per fetch request | int | 52428800
isolation.level | controls how transactional messages are read | string | read_uncommitted
request.timeout.ms | controls the maximum time the client waits for a request's response | int | 305000
partition.assignment.strategy | partition assignment strategy; the default RangeAssignor only balances evenly when the partition count is an integer multiple of the consumer count, while RoundRobinAssignor balances evenly | class | org.apache.kafka.clients.consumer.RangeAssignor
send.buffer.bytes | size of the TCP send buffer (SO_SNDBUF) used when sending data | int | 131072
receive.buffer.bytes | size of the TCP receive buffer (SO_RCVBUF) used when reading data | int | 65536

And so on... see the full kafkaConsumer configuration reference for more.

kafkaProducer

The kafkaProducer producer model

[Figure 2: the kafka producer model]
A send initiated by KafkaTemplate breaks down into the following steps:

  1. Data enters the accumulator pool
  • KafkaProducer starts sending the message
  • the producer interceptors intercept the message
  • the serializers serialize the data
  • the partitioner selects the message's partition
  • the record is appended to the RecordAccumulator
  2. NIO sends the data (a KafkaTemplate usage sketch follows this list)
  • once the batch threshold is reached, or a new RecordBatch is created, the Sender thread is woken immediately to execute its run() method
  • inside the Sender, the RecordBatches to send are taken from the accumulator's Deque and converted into ClientRequests
  • still inside the Sender, NetworkClient converts each request into a RequestSend (the Send interface) and calls the Selector to stage it into a KafkaChannel (the channel Map `channels` maintained by NetworkClient)
  • NIO performs the send: (1) Selector.select(); (2) the Send data in the KafkaChannel (ByteBuffer[]) is written to the KafkaChannel's GatheringByteChannel
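A minimal sketch of kicking this pipeline off from Spring (broker, topic, and payload are illustrative assumptions):

Map<String, Object> configs = new HashMap<>();
configs.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
configs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

KafkaTemplate<String, String> template =
        new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(configs));
// send() runs the interceptor -> serializer -> partitioner -> accumulator steps above;
// the Sender thread then pushes the batch through NetworkClient/Selector
template.send("demo-topic", "key", "hello kafka"); // assumed topic and payload
template.flush(); // force the accumulator to drain for this sketch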

Producer Configs

NAME | DESCRIPTION | TYPE | DEFAULT
bootstrap.servers | host/port list used to establish the initial connection to the Kafka cluster | list |
key.serializer | serializer class for keys (implements the Serializer interface) | class |
value.serializer | serializer class for values (implements the Serializer interface) | class |
acks | number of acknowledgments the producer requires before considering a request complete (-1/0/1) | string | 1; setting -1 (or all) is recommended
buffer.memory | total bytes of memory the producer uses to buffer records waiting to be sent to the server | long | 33554432
retries | a value greater than zero makes the client resend on failure | int | 0
connections.max.idle.ms | close idle connections after this many milliseconds | long | 540000
client.id | id string passed to the server with requests | string | ""
batch.size | when multiple records head to the same partition, the producer batches them together to reduce request round-trips | int | 16384
compression.type | compression algorithm for messages; compression lowers network and storage overhead, which is often the bottleneck when sending to kafka | string | none (no compression); snappy (invented by Google) is recommended for its low CPU cost and good compression ratio/performance
Note: several of the parameters above must be used together (see the sketch below):
request.required.acks=-1 (or all) --- ack only once every replica in the ISR has synced
min.insync.replicas=2 --- controls the number of replicas required in the ISR
unclean.leader.election.enable --- kafka's leader election policy (older brokers defaulted to true; since 0.11 the default is false):

unclean.leader.election.enable=false
waits for a replica in the ISR to come back alive and elects it as leader
unclean.leader.election.enable=true
elects the first replica that comes back alive (not necessarily one in the ISR) as leader
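A sketch of wiring these together on the producer side (values are illustrative, not recommendations); note that min.insync.replicas and unclean.leader.election.enable are broker/topic settings, shown here only as comments:

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
props.put(ProducerConfig.ACKS_CONFIG, "all");                         // wait for the full ISR to sync
props.put(ProducerConfig.RETRIES_CONFIG, 3);                          // resend on transient failures
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
// broker/topic side (server.properties or per-topic config), not producer settings:
//   min.insync.replicas=2
//   unclean.leader.election.enable=false
KafkaProducer<String, String> producer = new KafkaProducer<>(props,
        new StringSerializer(), new StringSerializer());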

And so on... see the full kafkaProducer configuration reference for more.

Pitfalls encountered in practice

Pitfall 1

[Figure 3: consumer log showing repeated rebalances]
[Figure 4: consumer log showing repeated rebalances]

  1. Problem: the logs above show the kafka consumer rebalancing abnormally, roughly once every few seconds, which is a fairly serious issue.
  2. Explanation (roughly what the log says: after polling a batch, the consumer spent too long processing it, so when the next poll deadline arrived it had not yet committed offsets to the broker (one broker serves as the consumer group's Coordinator); the broker then assumed the consumer had died and immediately rebalanced).
  3. Analysis
  • The configured interval between two polls was too short (if the first poll's data is not fully processed, no offset is committed by the time the next poll is due; the broker assumes the consumer has died and triggers a rebalance, which additionally causes duplicate consumption).
  4. Solution (parameters; a config sketch follows this list)
  • Increase the allowed processing time per poll: max.poll.interval.ms, default 300000 ms; it must be set larger than SESSION_TIMEOUT_MS_CONFIG (remember this!)
  • Cap the records fetched per poll: max.poll.records, default 500; tune it to your actual workload
  • Additionally: lengthen the consumer/broker session by raising session.timeout.ms (default 10 s),
    usually tuned together with heartbeat.interval.ms (default 3 s, typically set to 1/3 of the former)
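A sketch of those tunings as consumer properties (the concrete numbers are illustrative, not recommendations):

Properties props = new Properties();
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000); // allow more processing time per poll
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);        // fetch fewer records per poll
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);    // longer session...
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 10000); // ...heartbeat at ~1/3 of the session timeout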
  5. Solution (code)
// the outer loop keeps calling poll() to fetch data
while (isRunning) {
    ConsumerRecords<String, String> records = consumer.poll(300); // fetch a batch
    if (records != null && records.count() > 0) {
        // the inner loop processes each message
        for (ConsumerRecord<String, String> record : records) {
            delMessage(record); // business logic (delMessage is the user's handler)
        }
        try {
            // commit only after the whole batch of records has been processed
            consumer.commitSync();
        } catch (CommitFailedException e) {
            logger.error("commit offset failed, will break this loop", e);
            break; // after the rebalance, consumption resumes from the committed offset, avoiding duplicates
        }
    }
}

Pitfall 2

  1. Problem:
    the kafka client reports an error when the consumer commits an offset
ERROR{ConsumerCoordinator.java:833}-[Consumer clientId=consumer-1, groupId=default-group] Offset commit failed on partition smzc_mysql_data_sync-1 at offset 14763556: The request timed out.
  2. Analysis: this error tends to occur on heavily loaded partitions, not on lightly loaded ones. It does not actually affect data processing: after the timeout is thrown, the commit is retried, and the consumer later re-reads the message and processes it successfully.
  • Under manual acks (AckMode = MANUAL_IMMEDIATE), kafkaConsumer commits offsets with a synchronous commit by default (the most reliable option, since failed commits are retried).
  • Note: always make sure the offset commit succeeds; otherwise duplicate consumption is possible.
    How synchronous batch commits cause duplicate consumption: consume part of a partition's data -> some records are processed but the offset is not yet committed -> a rebalance occurs (a consumer died, the topic's partition count changed, etc.) -> that portion of the data is consumed again.
  3. Solutions:
  • request.timeout.ms (by default 305000 ms) ----- can be increased a bit.
    The configuration controls the maximum amount of time the client will wait for the response of a request.

  • connections.max.idle.ms (by default 540000 ms)
    Close idle connections after the number of milliseconds specified by this config.

  • offsets.channel.socket.timeout.ms (by default 10000) --- can be increased a bit.
