Kafka 3.0 topic creation flow (source code walkthrough)

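Contents

  • 1. From the create command to assembling the data needed to create a topic (Scala part)
  • 2. Creating a client that sends the create-topic request asynchronously via a queue and a background thread
    • (1) runnable.call (queueing and background-thread execution)
    • (2) getCreateTopicsCall (building the requestBuilder for the create-topic request)
  • 3. Handling the create-topic request on the server (handleCreateTopicsRequest)
    • (1) First, a look at what happens when the Kafka cluster starts
    • (2) How KafkaApis initialization chooses between ZooKeeper and Raft
    • (3) Where I got stuck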

1. From the create command to assembling the data needed to create a topic (Scala part)

The command for creating a Kafka topic looks like this:

bin/kafka-topics.sh --bootstrap-server broker_host:port --create --topic my_topic_name --partitions 20 --replication-factor 3 --config x=y

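--bootstrap-server  host and port of one of the Kafka brokers
--create  indicates that this command creates a topic
--topic  name of the topic to create
--partitions  explicitly sets the number of partitions
--replication-factor  explicitly sets how many replicas each partition has
--config x=y  optional; configs added on the command line override the server's default settings, for example how long data should be retained. The complete set of per-topic configs is listed in the Kafka documentation.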

Next, look at the command inside kafka-topics.sh:

exec $(dirname $0)/kafka-run-class.sh kafka.admin.TopicCommand "$@"

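So what actually runs is the main method in core/src/main/scala/kafka/admin/TopicCommand.scala.
Note that Kafka 3.0 removed the deprecated --zookeeper option from this tool, so the source below no longer contains ZookeeperTopicService; topics are always created through the Admin client. ZooKeeper itself is still supported by the brokers, and KRaft has been available as an alternative for cluster metadata management since 2.8.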


object TopicCommand extends Logging {

  def main(args: Array[String]): Unit = {
    val opts = new TopicCommandOptions(args)
    opts.checkArgs()
    // Build the TopicService instance (it wraps an Admin client)
    val topicService = TopicService(opts.commandConfig, opts.bootstrapServer)

    var exitCode = 0
    try {
      if (opts.hasCreateOption)
        // createTopic runs only when the command line contains --create
        topicService.createTopic(opts)
      else if (opts.hasAlterOption)
        topicService.alterTopic(opts)
      else if (opts.hasListOption)
        topicService.listTopics(opts)
      else if (opts.hasDescribeOption)
        topicService.describeTopic(opts)
      else if (opts.hasDeleteOption)
        topicService.deleteTopic(opts)
    } catch {
      case e: ExecutionException =>
        if (e.getCause != null)
          printException(e.getCause)
        else
          printException(e)
        exitCode = 1
      case e: Throwable =>
        printException(e)
        exitCode = 1
    } finally {
      topicService.close()
      Exit.exit(exitCode)
    }
  }

TopicService(opts.commandConfig, opts.bootstrapServer) invokes the apply method of the companion object below:

 object TopicService {
    def createAdminClient(commandConfig: Properties, bootstrapServer: Option[String]): Admin = {
      bootstrapServer match {
        case Some(serverList) => commandConfig.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, serverList)
        case None =>
      }
      Admin.create(commandConfig)
    }

    def apply(commandConfig: Properties, bootstrapServer: Option[String]): TopicService =
      new TopicService(createAdminClient(commandConfig, bootstrapServer))
  }

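apply in turn calls createAdminClient to build the client that will create the topic.

Next come the argument checks (whether the options were specified, whether their values are in range, and so on), after which the newly created client is used to create the topic: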

case class TopicService private (adminClient: Admin) extends AutoCloseable {

    def createTopic(opts: TopicCommandOptions): Unit = {
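      // Build the topic description from the command-line options (partition count, replication factor, and so on)
      val topic = new CommandTopicPartition(opts)
      if (Topic.hasCollisionChars(topic.name)) // Warn when the topic name contains '.' or '_'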
        println("WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could " +
          "collide. To avoid issues it is best to use either, but not both.")
      createTopic(topic)
    }

    def createTopic(topic: CommandTopicPartition): Unit = {
      // If --replication-factor is given, it must be between 1 and Short.MaxValue
      if (topic.replicationFactor.exists(rf => rf > Short.MaxValue || rf < 1))
        throw new IllegalArgumentException(s"The replication factor must be between 1 and ${Short.MaxValue} inclusive")
      // If --partitions is given, it must be greater than 0
      if (topic.partitions.exists(partitions => partitions < 1))
        throw new IllegalArgumentException(s"The partitions must be greater than 0")

      try {
        val newTopic = if (topic.hasReplicaAssignment)
          // If --replica-assignment is given, assign replicas exactly as specified
          new NewTopic(topic.name, asJavaReplicaReassignment(topic.replicaAssignment.get))
        else {
          new NewTopic(
            topic.name,
            topic.partitions.asJava,
            topic.replicationFactor.map(_.toShort).map(Short.box).asJava)
        }
        // Convert the --config entries into a config map
        val configsMap = topic.configsToAdd.stringPropertyNames()
          .asScala
          .map(name => name -> topic.configsToAdd.getProperty(name))
          .toMap.asJava

        newTopic.configs(configsMap)
        // Create the topic through the adminClient
        val createResult = adminClient.createTopics(Collections.singleton(newTopic),
          new CreateTopicsOptions().retryOnQuotaViolation(false))
        createResult.all().get()
        println(s"Created topic ${topic.name}.")
      } catch {
        case e : ExecutionException =>
          if (e.getCause == null)
            throw e
          if (!(e.getCause.isInstanceOf[TopicExistsException] && topic.ifTopicDoesntExist()))
            throw e.getCause
      }
    }
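The two range checks at the top of createTopic can be tried out in isolation. The following is a minimal Java sketch, not Kafka code: the class TopicArgsCheck and its methods are hypothetical names, and Optional stands in for the Scala Option fields of CommandTopicPartition.

```java
import java.util.Optional;

// Hypothetical helper (not part of Kafka): mirrors the two range checks in createTopic.
final class TopicArgsCheck {

    // Throws IllegalArgumentException when a supplied value is out of range.
    // An absent value is accepted, just as an empty Option passes the exists(...) checks.
    static void validate(Optional<Integer> partitions, Optional<Integer> replicationFactor) {
        replicationFactor.ifPresent(rf -> {
            if (rf > Short.MAX_VALUE || rf < 1) {
                throw new IllegalArgumentException(
                    "The replication factor must be between 1 and " + Short.MAX_VALUE + " inclusive");
            }
        });
        partitions.ifPresent(p -> {
            if (p < 1) {
                throw new IllegalArgumentException("The partitions must be greater than 0");
            }
        });
    }

    // Convenience wrapper that reports the outcome as a boolean.
    static boolean isValid(Optional<Integer> partitions, Optional<Integer> replicationFactor) {
        try {
            validate(partitions, replicationFactor);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With the values from the example command, isValid(Optional.of(20), Optional.of(3)) passes, while a partition count of 0 or a replication factor above Short.MAX_VALUE is rejected before any request is sent.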

2. Creating a client that sends the create-topic request asynchronously via a queue and a background thread

The createTopics method in KafkaAdminClient.java:

  @Override
    public CreateTopicsResult createTopics(final Collection<NewTopic> newTopics,
                                           final CreateTopicsOptions options) {
        final Map<String, KafkaFutureImpl<TopicMetadataAndConfig>> topicFutures = new HashMap<>(newTopics.size());
        final CreatableTopicCollection topics = new CreatableTopicCollection();
        // Iterate over the topics to be created
        for (NewTopic newTopic : newTopics) {
            if (topicNameIsUnrepresentable(newTopic.name())) {
                // The topic name cannot be represented in a request, so fail its future immediately
                KafkaFutureImpl<TopicMetadataAndConfig> future = new KafkaFutureImpl<>();
                future.completeExceptionally(new InvalidTopicException("The given topic name '" +
                    newTopic.name() + "' cannot be represented in a request."));
                topicFutures.put(newTopic.name(), future);
            } else if (!topicFutures.containsKey(newTopic.name())) { // Skip duplicate names within one request
                topicFutures.put(newTopic.name(), new KafkaFutureImpl<>());
                topics.add(newTopic.convertToCreatableTopic());
            }
        }
        // Only send a request if at least one topic remains
        if (!topics.isEmpty()) {
            final long now = time.milliseconds();
            final long deadline = calcDeadlineMs(now, options.timeoutMs());
            // Build the Call object that describes the create-topic request
            final Call call = getCreateTopicsCall(options, topicFutures, topics,
                Collections.emptyMap(), now, deadline);
            // This hands the call over for sending; the line above only constructs it
            runnable.call(call, now);
        }
        return new CreateTopicsResult(new HashMap<>(topicFutures));
    }
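The bookkeeping in createTopics (one future per topic name, immediate failure for bad names, duplicates ignored) can be sketched with plain JDK types. This is only an illustration: CreateTopicsSketch is a hypothetical class, CompletableFuture stands in for KafkaFutureImpl, and treating "unrepresentable" as null or empty is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of the per-topic bookkeeping done before a request is sent.
final class CreateTopicsSketch {
    final Map<String, CompletableFuture<Void>> futures = new HashMap<>();
    final List<String> toSend = new ArrayList<>();

    // Assumption for this sketch: a name is unrepresentable when it is null or empty.
    static boolean unrepresentable(String name) {
        return name == null || name.isEmpty();
    }

    static CreateTopicsSketch plan(Collection<String> names) {
        CreateTopicsSketch sketch = new CreateTopicsSketch();
        for (String name : names) {
            if (unrepresentable(name)) {
                // Fail this topic's future right away; it never reaches the request.
                CompletableFuture<Void> failed = new CompletableFuture<>();
                failed.completeExceptionally(new IllegalArgumentException(
                    "The given topic name '" + name + "' cannot be represented in a request."));
                sketch.futures.put(name, failed);
            } else if (!sketch.futures.containsKey(name)) {
                // First occurrence of a valid name: register a pending future and queue it.
                sketch.futures.put(name, new CompletableFuture<>());
                sketch.toSend.add(name);
            }
            // A repeated valid name falls through: it shares the future registered earlier.
        }
        return sketch;
    }
}
```

For the input "a", "", "a", "b" the sketch queues only "a" and "b", while the caller still gets a future for the empty name that is already completed exceptionally.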

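(1) runnable.call (queueing and background-thread execution)

Why cover this before getCreateTopicsCall? Because the flow is not a single straight-line call: the request is built lazily and sent later by another thread, so reading getCreateTopicsCall first is confusing. Starting with runnable.call makes the rest easier to follow.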

 /**
         * Initiate a new call.
         *
         * This will fail if the AdminClient is scheduled to shut down.
         *
         * @param call      The new call object.
         * @param now       The current time in milliseconds.
         */
        void call(Call call, long now) {
            if (hardShutdownTimeMs.get() != INVALID_SHUTDOWN_TIME) {
                log.debug("The AdminClient is not accepting new calls. Timing out {}.", call);
                call.handleTimeoutFailure(time.milliseconds(),
                    new TimeoutException("The AdminClient thread is not accepting new calls."));
            } else {
                enqueue(call, now);
            }
        }
         /**
         * Queue a call for sending.
         *
         * If the AdminClient thread has exited, this will fail. Otherwise, it will succeed (even
         * if the AdminClient is shutting down). This function should be called when retrying an
         * existing call.
         *
         * @param call      The new call object.
         * @param now       The current time in milliseconds.
         */
        void enqueue(Call call, long now) {
            if (call.tries > maxRetries) {
                log.debug("Max retries {} for {} reached", maxRetries, call);
                call.handleTimeoutFailure(time.milliseconds(), new TimeoutException(
                    "Exceeded maxRetries after " + call.tries + " tries."));
                return;
            }
            if (log.isDebugEnabled()) {
                log.debug("Queueing {} with a timeout {} ms from now.", call,
                    Math.min(requestTimeoutMs, call.deadlineMs - now));
            }
            boolean accepted = false;
            // Add the call to the newCalls queue
            synchronized (this) {
                if (!closing) {
                    newCalls.add(call);
                    accepted = true;
                }
            }
            // Wake up the AdminClient thread so it picks up the new call
            if (accepted) {
                client.wakeup(); // wake the thread if it is in poll()
            } else {
                log.debug("The AdminClient thread has exited. Timing out {}.", call);
                call.handleTimeoutFailure(time.milliseconds(),
                    new TimeoutException("The AdminClient thread has exited."));
            }
        }

The thread woken by client.wakeup() is the one running the following code:

        @Override
        public void run() {
            log.debug("Thread starting");
            try {
                // Requests are processed here
                processRequests();
            } finally {
                closing = true;
                // omitted
                log.debug("Exiting AdminClientRunnable thread.");
            }
        }
        private void processRequests() {
            long now = time.milliseconds();
            while (true) {
                // Copy newCalls into pendingCalls.
                drainNewCalls();

                // Check if the AdminClient thread should shut down.
                long curHardShutdownTimeMs = hardShutdownTimeMs.get();
                if ((curHardShutdownTimeMs != INVALID_SHUTDOWN_TIME) && threadShouldExit(now, curHardShutdownTimeMs))
                    break;

                // Handle timeouts.
                TimeoutProcessor timeoutProcessor = timeoutProcessorFactory.create(now);
                timeoutPendingCalls(timeoutProcessor);
                timeoutCallsToSend(timeoutProcessor);
                timeoutCallsInFlight(timeoutProcessor);

                long pollTimeout = Math.min(1200000, timeoutProcessor.nextTimeoutMs());
                if (curHardShutdownTimeMs != INVALID_SHUTDOWN_TIME) {
                    pollTimeout = Math.min(pollTimeout, curHardShutdownTimeMs - now);
                }

                // Choose nodes for our pending calls.
                pollTimeout = Math.min(pollTimeout, maybeDrainPendingCalls(now));
                long metadataFetchDelayMs = metadataManager.metadataFetchDelayMs(now);
                if (metadataFetchDelayMs == 0) {
                    metadataManager.transitionToUpdatePending(now);
                    Call metadataCall = makeMetadataCall(now);
                    // Create a new metadata fetch call and add it to the end of pendingCalls.
                    // Assign a node for just the new call (we handled the other pending nodes above).
                    if (!maybeDrainPendingCall(metadataCall, now))
                        pendingCalls.add(metadataCall);
                }
                pollTimeout = Math.min(pollTimeout, sendEligibleCalls(now));

                if (metadataFetchDelayMs > 0) {
                    pollTimeout = Math.min(pollTimeout, metadataFetchDelayMs);
                }

                // Ensure that we use a small poll timeout if there are pending calls which need to be sent
                if (!pendingCalls.isEmpty())
                    pollTimeout = Math.min(pollTimeout, retryBackoffMs);

                // Wait for network responses.
                log.trace("Entering KafkaClient#poll(timeout={})", pollTimeout);
                List<ClientResponse> responses = client.poll(Math.max(0L, pollTimeout), now);
                log.trace("KafkaClient#poll retrieved {} response(s)", responses.size());

                // unassign calls to disconnected nodes
                unassignUnsentCalls(client::connectionFailed);

                // Update the current time and handle the latest responses.
                now = time.milliseconds();
                handleResponses(now, responses);
            }
        }
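The hand-off between enqueue and processRequests is a producer/consumer pattern: caller threads add calls to a shared list under a lock, and one background thread drains and executes them. Below is a minimal sketch of that pattern with JDK primitives only. CallQueueSketch is a hypothetical class; wait/notifyAll stands in for client.wakeup() and client.poll(), and timeouts, retries and node selection are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of the newCalls queue and the single thread that drains it.
final class CallQueueSketch implements Runnable {
    private final List<Runnable> newCalls = new ArrayList<>(); // filled by caller threads
    private boolean closing = false;

    // Caller side: add under the lock, then wake the worker (stand-in for client.wakeup()).
    synchronized boolean enqueue(Runnable call) {
        if (closing) {
            return false; // the worker is going away, so the call is rejected
        }
        newCalls.add(call);
        notifyAll();
        return true;
    }

    synchronized void close() {
        closing = true;
        notifyAll();
    }

    // Worker side: block until there is work or we are closing, then take everything queued.
    private synchronized List<Runnable> drain() throws InterruptedException {
        while (newCalls.isEmpty() && !closing) {
            wait();
        }
        List<Runnable> pending = new ArrayList<>(newCalls);
        newCalls.clear();
        return pending;
    }

    @Override
    public void run() {
        try {
            while (true) {
                List<Runnable> pending = drain();
                if (pending.isEmpty()) {
                    return; // closing and nothing left to do
                }
                for (Runnable call : pending) {
                    call.run(); // executed outside the lock, on the worker thread
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Small end-to-end demo: the caller blocks on a future that the worker completes.
    static int demo() {
        CallQueueSketch queue = new CallQueueSketch();
        Thread worker = new Thread(queue, "admin-client-sketch");
        worker.start();
        CompletableFuture<Integer> result = new CompletableFuture<>();
        queue.enqueue(() -> result.complete(42));
        int value = result.join();
        queue.close();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }
}
```

The caller never runs the call itself; it only waits on the future, which matches how createResult.all().get() in TopicCommand blocks until the background thread has handled the response.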

sendEligibleCalls is the method that actually sends each call:

 /**
         * Send the calls which are ready.
         *
         * @param now                   The current time in milliseconds.
         * @return                      The minimum timeout we need for poll().
         */
        private long sendEligibleCalls(long now) {
            long pollTimeout = Long.MAX_VALUE;
            for (Iterator<Map.Entry<Node, List<Call>>> iter = callsToSend.entrySet().iterator(); iter.hasNext(); ) {
                Map.Entry<Node, List<Call>> entry = iter.next();
                List<Call> calls = entry.getValue();
                if (calls.isEmpty()) {
                    iter.remove();
                    continue;
                }
                // omitted (node and remainingRequestTime are set up here)
                while (!calls.isEmpty()) {
                    Call call = calls.remove(0);
                    int timeoutMs = Math.min(remainingRequestTime,
                        calcTimeoutMsRemainingAsInt(now, call.deadlineMs));
                    AbstractRequest.Builder<?> requestBuilder;
                    try {
                        // Obtain the requestBuilder from the call
                        requestBuilder = call.createRequest(timeoutMs);
                    } catch (Throwable t) {
                        call.fail(now, new KafkaException(String.format(
                            "Internal error sending %s to %s.", call.callName, node), t));
                        continue;
                    }
                    ClientRequest clientRequest = client.newClientRequest(node.idString(),
                        requestBuilder, now, true, timeoutMs, null);
                    log.debug("Sending {} to {}. correlationId={}, timeoutMs={}",
                        requestBuilder, node, clientRequest.correlationId(), timeoutMs);
                    // Actually send the request
                    client.send(clientRequest, now);
                    callsInFlight.put(node.idString(), call);
                    correlationIdToCalls.put(clientRequest.correlationId(), call);
                    break;
                }
            }
            return pollTimeout;
        }

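Pay particular attention to the line requestBuilder = call.createRequest(timeoutMs); the request builder is only created at this point, by the createRequest method that getCreateTopicsCall defines below.

(2) getCreateTopicsCall (building the requestBuilder for the create-topic request)

Having gone through runnable.call, let's see how getCreateTopicsCall builds the Call.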

 private Call getCreateTopicsCall(final CreateTopicsOptions options,
                                     final Map<String, KafkaFutureImpl<TopicMetadataAndConfig>> futures,
                                     final CreatableTopicCollection topics,
                                     final Map<String, ThrottlingQuotaExceededException> quotaExceededExceptions,
                                     final long now,
                                     final long deadline) {
        return new Call("createTopics", deadline, new ControllerNodeProvider()) {
            @Override
            public CreateTopicsRequest.Builder createRequest(int timeoutMs) {
                return new CreateTopicsRequest.Builder(
                    new CreateTopicsRequestData()
                        .setTopics(topics)
                        .setTimeoutMs(timeoutMs)
                        .setValidateOnly(options.shouldValidateOnly()));
            }

            @Override
            public void handleResponse(AbstractResponse abstractResponse) {
              // omitted
            }

            private ConfigEntry configEntry(CreatableTopicConfigs config) {
                return new ConfigEntry(
                    config.name(),
                    config.value(),
                    configSource(DescribeConfigsResponse.ConfigSource.forId(config.configSource())),
                    config.isSensitive(),
                    config.readOnly(),
                    Collections.emptyList(),
                    null,
                    null);
            }

            @Override
            void handleFailure(Throwable throwable) {
                // If there were any topics retries due to a quota exceeded exception, we propagate
                // the initial error back to the caller if the request timed out.
                maybeCompleteQuotaExceededException(options.shouldRetryOnQuotaViolation(),
                    throwable, futures, quotaExceededExceptions, (int) (time.milliseconds() - now));
                // Fail all the other remaining futures
                completeAllExceptionally(futures.values(), throwable);
            }
        };
    }
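The key point of getCreateTopicsCall is that it returns an object whose createRequest method runs later, at send time, with a timeout computed at that moment. A minimal sketch of that deferred-construction idea follows. DeferredRequestSketch, its nested Call class and the string "request" are all hypothetical stand-ins; the timeout arithmetic is a simplification of what the send loop does.

```java
import java.util.List;

// Hypothetical model: a Call captures its inputs now and builds the request only when asked.
final class DeferredRequestSketch {

    abstract static class Call {
        final String name;
        final long deadlineMs;

        Call(String name, long deadlineMs) {
            this.name = name;
            this.deadlineMs = deadlineMs;
        }

        // Invoked by the sender, not by the code that created the Call.
        abstract String createRequest(int timeoutMs);
    }

    // Counterpart of getCreateTopicsCall: captures the topics, defers building the request.
    static Call createTopicsCall(List<String> topics, long deadlineMs) {
        return new Call("createTopics", deadlineMs) {
            @Override
            String createRequest(int timeoutMs) {
                return "CreateTopics(topics=" + topics + ", timeoutMs=" + timeoutMs + ")";
            }
        };
    }

    // Counterpart of the send loop: the timeout depends on how much time is left right now.
    static String send(Call call, long nowMs, int requestTimeoutMs) {
        long remaining = Math.max(0L, call.deadlineMs - nowMs);
        int timeoutMs = (int) Math.min((long) requestTimeoutMs, remaining);
        return call.createRequest(timeoutMs);
    }
}
```

Because the request is built inside send, the same Call can be re-sent after a retry with a smaller timeout, which would not be possible if the request were built once up front.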

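new ControllerNodeProvider() supplies the current controller node (or null, after requesting a metadata update, if the controller is not yet known), so the request is sent to the controller and received there on the server side: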

 /**
     * Provides the controller node.
     */
    private class ControllerNodeProvider implements NodeProvider {
        @Override
        public Node provide() {
            if (metadataManager.isReady() &&
                    (metadataManager.controller() != null)) {
                return metadataManager.controller();
            }
            metadataManager.requestUpdate();
            return null;
        }
    }
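The provider's contract, return a node when the controller is known and otherwise request a metadata update and return null, can be shown with a tiny model. ControllerProviderSketch and its Metadata class are hypothetical; an Integer id stands in for Node, and the readiness check is reduced to "a controller id is present".

```java
// Hypothetical model of a node provider that may have no answer yet.
final class ControllerProviderSketch {

    // Stand-in for the admin client's metadata manager.
    static final class Metadata {
        Integer controllerId;   // null until a metadata response has been processed
        int updateRequests = 0; // counts how often a refresh was asked for

        boolean isReady() {
            return controllerId != null;
        }

        void requestUpdate() {
            updateRequests++;
        }
    }

    // Returns the controller id, or null after asking for fresh metadata.
    static Integer provide(Metadata metadata) {
        if (metadata.isReady()) {
            return metadata.controllerId;
        }
        metadata.requestUpdate();
        return null; // the call stays pending and is retried on a later loop iteration
    }
}
```

A null result is not an error here: in the loop shown earlier the call simply remains in pendingCalls until a metadata fetch has filled in the controller.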

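3. Handling the create-topic request on the server (handleCreateTopicsRequest)

(1) First, a look at what happens when the Kafka cluster starts

Why add this step?
Mainly because, starting with Kafka 2.8, there is an alternative to ZooKeeper: KRaft can take over the work ZooKeeper used to do, which has been described as revolutionary. The ZooKeeper mode has not been dropped, though; KRaft is simply a new option alongside it.

See my other article, "How Kafka 2.8 chooses between KRaft and ZooKeeper" (the selection logic in the source; the KRaft implementation itself is not covered).

(2) How KafkaApis initialization chooses between ZooKeeper and Raft

When Kafka starts, startup is called to perform initialization.
Only the KafkaRaftServer path is shown below.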

def startup(): Unit = {
      // omitted
      // Create the request processor objects.
      // Pay special attention to raftSupport and its position among the arguments to new KafkaApis.
      val raftSupport = RaftSupport(forwardingManager, metadataCache)
      dataPlaneRequestProcessor = new KafkaApis(socketServer.dataPlaneRequestChannel, raftSupport,
        replicaManager, groupCoordinator, transactionCoordinator, autoTopicCreationManager,
        config.nodeId, config, metadataCache, metadataCache, metrics, authorizer, quotaManagers,
        fetchManager, brokerTopicStats, clusterId, time, tokenManager, apiVersionManager)

      dataPlaneRequestHandlerPool = new KafkaRequestHandlerPool(config.nodeId,
        socketServer.dataPlaneRequestChannel, dataPlaneRequestProcessor, time,
        config.numIoThreads, s"${SocketServer.DataPlaneMetricPrefix}RequestHandlerAvgIdlePercent",
        SocketServer.DataPlaneThreadPrefix)
      // omitted
 }       

Now look at the KafkaApis constructor in KafkaApis.scala; KafkaApis is the server-side entry point that handles incoming requests:


/**
 * Logic to handle the various Kafka requests
 */
class KafkaApis(val requestChannel: RequestChannel,
                val metadataSupport: MetadataSupport,
                val replicaManager: ReplicaManager,
                val groupCoordinator: GroupCoordinator,
                val txnCoordinator: TransactionCoordinator,
                val autoTopicCreationManager: AutoTopicCreationManager,
                val brokerId: Int,
                val config: KafkaConfig,
                val configRepository: ConfigRepository,
                val metadataCache: MetadataCache,
                val metrics: Metrics,
                val authorizer: Option[Authorizer],
                val quotas: QuotaManagers,
                val fetchManager: FetchManager,
                brokerTopicStats: BrokerTopicStats,
                val clusterId: String,
                time: Time,
                val tokenManager: DelegationTokenManager,
                val apiVersionManager: ApiVersionManager) extends ApiRequestHandler with Logging 

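The second parameter is a MetadataSupport, and startup passes raftSupport for it. So in KRaft mode every call on metadataSupport below is a call on RaftSupport; keep that in mind when a method or variable name mentions zk.

On the server, the create-topic request sent by the client is routed to handleCreateTopicsRequest: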

 /**
   * Top-level method that handles all requests and multiplexes to the right api
   */
  override def handle(request: RequestChannel.Request, requestLocal: RequestLocal): Unit = {
    try {
      // omitted
      request.header.apiKey match {
        // omitted
        case ApiKeys.CREATE_TOPICS => maybeForwardToController(request, handleCreateTopicsRequest)
        // omitted
      }
    } catch {
      // omitted
    } finally {
      // omitted
    }
  }

maybeForwardToController is not covered in detail here; go straight to handleCreateTopicsRequest:

def handleCreateTopicsRequest(request: RequestChannel.Request): Unit = {
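    // requireZkOrThrow is called on whatever metadataSupport was passed in (see 3(2)):
    // ZkSupport returns itself, while RaftSupport throws the supplied exception
    val zkSupport = metadataSupport.requireZkOrThrow(KafkaApis.shouldAlwaysForward(request))
    // omitted
    val createTopicsRequest = request.body[CreateTopicsRequest]
    val results = new CreatableTopicResultCollection(createTopicsRequest.data.topics.size)
    // If this broker is not the active controller, answer every topic with NOT_CONTROLLER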
    if (!zkSupport.controller.isActive) {
      createTopicsRequest.data.topics.forEach { topic =>
        results.add(new CreatableTopicResult().setName(topic.name)
          .setErrorCode(Errors.NOT_CONTROLLER.code))
      }
      sendResponseCallback(results)
    } else {	  
      // omitted
      zkSupport.adminManager.createTopics(
        createTopicsRequest.data.timeoutMs,
        createTopicsRequest.data.validateOnly,
        toCreate,
        authorizedForDescribeConfigs,
        controllerMutationQuota,
        handleCreateTopicsResults)
 }

zkSupport.adminManager.createTopics is where the topic is actually created.

(3) Where I got stuck

1. Why does zkSupport.adminManager.createTopics go to the createTopics method of ZkAdminManager? Shouldn't there be a RaftAdminManager?
2. The implementation behind val zkSupport = metadataSupport.requireZkOrThrow:


case class ZkSupport(adminManager: ZkAdminManager,
                     controller: KafkaController,
                     zkClient: KafkaZkClient,
                     forwardingManager: Option[ForwardingManager],
                     metadataCache: ZkMetadataCache) extends MetadataSupport {
  val adminZkClient = new AdminZkClient(zkClient)

  override def requireZkOrThrow(createException: => Exception): ZkSupport = this
  override def requireRaftOrThrow(createException: => Exception): RaftSupport = throw createException

  override def ensureConsistentWith(config: KafkaConfig): Unit = {
    if (!config.requiresZookeeper) {
      throw new IllegalStateException("Config specifies Raft but metadata support instance is for ZooKeeper")
    }
  }

  override def maybeForward(request: RequestChannel.Request,
                            handler: RequestChannel.Request => Unit,
                            responseCallback: Option[AbstractResponse] => Unit): Unit = {
    forwardingManager match {
      case Some(mgr) if !request.isForwarded && !controller.isActive => mgr.forwardRequest(request, responseCallback)
      case _ => handler(request)
    }
  }

  override def controllerId: Option[Int] =  metadataCache.getControllerId
}

case class RaftSupport(fwdMgr: ForwardingManager, metadataCache: KRaftMetadataCache)
    extends MetadataSupport {
  override val forwardingManager: Option[ForwardingManager] = Some(fwdMgr)
  override def requireZkOrThrow(createException: => Exception): ZkSupport = throw createException
  override def requireRaftOrThrow(createException: => Exception): RaftSupport = this

  override def ensureConsistentWith(config: KafkaConfig): Unit = {
    if (config.requiresZookeeper) {
      throw new IllegalStateException("Config specifies ZooKeeper but metadata support instance is for Raft")
    }
  }

  override def maybeForward(request: RequestChannel.Request,
                            handler: RequestChannel.Request => Unit,
                            responseCallback: Option[AbstractResponse] => Unit): Unit = {
    if (!request.isForwarded) {
      fwdMgr.forwardRequest(request, responseCallback)
    } else {
      handler(request) // will reject
    }
  }

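In ZooKeeper mode, requireZkOrThrow of ZkSupport is called, which is easy to understand. In KRaft mode it is requireZkOrThrow of RaftSupport, that is, override def requireZkOrThrow(createException: => Exception): ZkSupport = throw createException. Since that method throws instead of returning, nothing is assigned to zkSupport in handleCreateTopicsRequest, and the zkSupport.adminManager.createTopics call after it cannot be reached on that path. Judging from RaftSupport.maybeForward, a request that has not been forwarded yet is sent on to the controller first, but I have not traced where the controller then creates the topic.

If anyone knows the details, please explain; otherwise I will come back to this later.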
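The behaviour of requireZkOrThrow in the two case classes can be reproduced in a few lines. The sketch below models only what the ZkSupport and RaftSupport excerpts show; MetadataSupportSketch and everything inside it are hypothetical names, a Supplier stands in for the Scala by-name parameter, and the returned strings merely label which branch ran.

```java
import java.util.function.Supplier;

// Hypothetical model of the two MetadataSupport implementations.
final class MetadataSupportSketch {

    interface MetadataSupport {
        // The exception is supplied lazily, like the by-name parameter in the Scala code.
        ZkSupport requireZkOrThrow(Supplier<RuntimeException> createException);
    }

    static final class ZkSupport implements MetadataSupport {
        @Override
        public ZkSupport requireZkOrThrow(Supplier<RuntimeException> createException) {
            return this; // ZooKeeper mode: hand back the ZooKeeper-specific support object
        }

        String createTopics() {
            return "ZkAdminManager.createTopics";
        }
    }

    static final class RaftSupport implements MetadataSupport {
        @Override
        public ZkSupport requireZkOrThrow(Supplier<RuntimeException> createException) {
            throw createException.get(); // KRaft mode: never returns a value
        }
    }

    // Shape of a handler that first requires ZooKeeper support and then uses it.
    static String handleCreateTopics(MetadataSupport support) {
        try {
            ZkSupport zk = support.requireZkOrThrow(
                () -> new IllegalStateException("should have been forwarded"));
            return zk.createTopics(); // reached only when support is a ZkSupport
        } catch (IllegalStateException e) {
            return "rejected: " + e.getMessage();
        }
    }
}
```

Calling handleCreateTopics with a ZkSupport reaches createTopics, while calling it with a RaftSupport ends in the exception branch, which is consistent with the "will reject" comment in RaftSupport.maybeForward.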
