Performance optimization for writing to Kafka from Spark Streaming

In real projects we sometimes need to write data back to Kafka in real time. The straightforward way to do it usually looks like this:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

kafkaStreams.foreachRDD(rdd => {
      if (!rdd.isEmpty()) {
        rdd.foreachPartition(pr => {
          val properties = new Properties()
          properties.put("group.id", "jaosn_") // note: group.id is a consumer config; the producer ignores it
          properties.put("acks", "all")
          properties.put("retries", "1")
          properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, PropertiesScalaUtils.loadProperties("broker"))
          properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName) // key serializer
          properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName) // value serializer
          val producer = new KafkaProducer[String, String](properties)
          pr.foreach(pair => {
            producer.send(new ProducerRecord(PropertiesScalaUtils.loadProperties("topic") + "_error", pair.value()))
          })
          producer.close() // flush buffered records and release the producer's resources
        })
      }
    })
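The problem with this pattern is that a brand-new KafkaProducer (with its own TCP connections and buffers) is constructed for every partition of every micro-batch, which becomes expensive at short batch intervals. A widely used optimization is to wrap a lazily created producer in a serializable class and broadcast it once, so each executor reuses a single producer across batches. The following is a minimal sketch of that idea, not a drop-in replacement: it assumes the kafkaStreams DStream and PropertiesScalaUtils helper from above, and a SparkContext named sc.

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord, RecordMetadata}

// Serializable wrapper: KafkaProducer itself is not serializable, so it is
// created lazily on each executor the first time send() is called, and then
// reused for all subsequent partitions and batches on that executor.
class KafkaSink[K, V](createProducer: () => KafkaProducer[K, V]) extends Serializable {
  lazy val producer = createProducer()
  def send(topic: String, value: V): java.util.concurrent.Future[RecordMetadata] =
    producer.send(new ProducerRecord[K, V](topic, value))
}

object KafkaSink {
  def apply[K, V](config: Properties): KafkaSink[K, V] = {
    val createProducerFunc = () => {
      val producer = new KafkaProducer[K, V](config)
      // flush any buffered records before the executor JVM exits
      sys.addShutdownHook(producer.close())
      producer
    }
    new KafkaSink(createProducerFunc)
  }
}

// `properties` is the same producer config as above, but built once on the driver;
// broadcasting the sink is cheap because the producer is not created yet.
val kafkaSink = sc.broadcast(KafkaSink[String, String](properties))

kafkaStreams.foreachRDD(rdd => {
  if (!rdd.isEmpty()) {
    rdd.foreachPartition(pr => {
      pr.foreach(pair => {
        // reuse the per-executor producer instead of building a new one per partition
        kafkaSink.value.send(PropertiesScalaUtils.loadProperties("topic") + "_error", pair.value())
      })
    })
  }
})

With this setup the producer's connections and internal buffers survive across micro-batches, so the producer-side batching controlled by linger.ms and batch.size can actually take effect.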
