Big Data Basics: The Distributed Message Queue Kafka (2) -- Programming with the Kafka Java API and Integrating with Flume

1. Prerequisites

JDK 1.8

Maven 3.5.2

2. Implementation with the Java API

Open IDEA and create a Maven project. First, edit the pom.xml file.

    
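    <properties>
        <scala.version>2.11.8</scala.version>
        <kafka.version>0.8.2.1</kafka.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>${kafka.version}</version>
        </dependency>
    </dependencies>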
    

Start with a class that holds the Kafka configuration values.

/**
 * Commonly used Kafka configuration constants
 */
public class KafkaProperties {

    public static final String ZK = "hadoop000:2181";

    public static final String TOPIC = "hello_topic";

    public static final String BROKER_LIST = "hadoop000:9092";

    public static final String GROUP_ID = "test_group1";
}

Then define the producer.

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

import java.util.Properties;

/**
 * Kafka producer
 */
public class KafkaProducer extends Thread {

    private String topic;

    private Producer<Integer, String> producer;

    public KafkaProducer(String topic) {
        this.topic = topic;

        Properties properties = new Properties();
        properties.put("metadata.broker.list", KafkaProperties.BROKER_LIST);
        properties.put("serializer.class", "kafka.serializer.StringEncoder");
        properties.put("request.required.acks", "1");  // the broker sends back an ack once it has received the message

        producer = new Producer<Integer, String>(new ProducerConfig(properties));
    }

    @Override
    public void run() {
        int messageNo = 1;

        while (true) {
            String message = "message_" + messageNo;
            producer.send(new KeyedMessage<Integer, String>(topic, message));
            System.out.println("Send " + message);
            messageNo++;

            try {
                Thread.sleep(2000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
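
The loop above never exits, and its catch block swallows the interrupt that another thread would use to stop it. As a minimal sketch of one way to make such a sender loop stoppable, the class below keeps the same message numbering and pause but leaves the loop when the thread is interrupted. StoppableSender and its sink parameter are hypothetical names, not part of the Kafka API; sink stands in for the producer.send(...) call so that the example runs without a broker.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/**
 * Hypothetical stand-alone sketch of a stoppable sender loop.
 * The sink stands in for producer.send(...), so no broker is needed.
 */
public class StoppableSender extends Thread {

    private final Consumer<String> sink;   // assumption: replaces the Kafka producer
    private final long pauseMillis;        // pause between two messages

    public StoppableSender(Consumer<String> sink, long pauseMillis) {
        this.sink = sink;
        this.pauseMillis = pauseMillis;
    }

    @Override
    public void run() {
        int messageNo = 1;
        // Leave the loop as soon as another thread calls interrupt().
        while (!Thread.currentThread().isInterrupted()) {
            sink.accept("message_" + messageNo);
            messageNo++;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // sleep() clears the interrupt flag, so set it again to end the loop.
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> sent = new CopyOnWriteArrayList<>();
        StoppableSender sender = new StoppableSender(sent::add, 100);
        sender.start();
        Thread.sleep(350);      // let a few messages go out
        sender.interrupt();     // request shutdown
        sender.join();          // wait for the loop to finish
        System.out.println("Sent before shutdown: " + sent);
    }
}
```

In the real producer, the same pattern would wrap the producer.send(...) call, and producer.close() could be called after the loop ends.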

Now write a test class.

/**
 * Kafka Java API test
 */
public class KafkaClientApp {

    public static void main(String[] args) {

        new KafkaProducer(KafkaProperties.TOPIC).start();

        //new KafkaConsumer(KafkaProperties.TOPIC).start();
    }
}

Run it, and the console prints a "Send message_N" line for each message, for example "Send message_1".

Next, write a consumer.

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/**
 * Kafka consumer
 */
public class KafkaConsumer extends Thread {

    private String topic;

    public KafkaConsumer(String topic) {
        this.topic = topic;

    }

    private ConsumerConnector createConnector() {
        Properties properties = new Properties();
        properties.put("zookeeper.connect", KafkaProperties.ZK);
        properties.put("group.id", KafkaProperties.GROUP_ID);
        return Consumer.createJavaConsumerConnector(new ConsumerConfig(properties));
    }

    @Override
    public void run() {
        ConsumerConnector consumer = createConnector();

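        // topic -> number of streams to create for that topic
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, 1);
//        topicCountMap.put(topic2, 1);
//        topicCountMap.put(topic3, 1);

        // String: topic
        // List<KafkaStream<byte[], byte[]>>: the data streams for that topic
        Map<String, List<KafkaStream<byte[], byte[]>>> messageStreams = consumer.createMessageStreams(topicCountMap);

        KafkaStream<byte[], byte[]> stream = messageStreams.get(topic).get(0);  // the stream that delivers the messages we receive

        ConsumerIterator<byte[], byte[]> iterator = stream.iterator();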

        while (iterator.hasNext()) {
            String message = new String(iterator.next().message());
            System.out.println("rec: " + message);
        }
    }
}
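
The consumer receives each payload as a byte[] and turns it into text with new String(bytes), which uses the JVM's default charset. A minimal sketch of decoding with an explicit charset follows. It assumes the producer encoded the text as UTF-8, which is what kafka.serializer.StringEncoder does unless it is configured otherwise. MessageDecodingSketch and decode are illustrative names, not part of the Kafka API.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical helper showing charset-explicit decoding of a message payload.
 */
public class MessageDecodingSketch {

    /** Decodes a payload as UTF-8 regardless of the JVM default charset. */
    public static String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes returned by iterator.next().message().
        String original = "caf\u00e9_1";
        byte[] payload = original.getBytes(StandardCharsets.UTF_8);

        String decoded = decode(payload);
        System.out.println("round trip ok: " + decoded.equals(original));
    }
}
```

In the consumer loop this would amount to replacing new String(iterator.next().message()) with a call that passes StandardCharsets.UTF_8 as the second argument.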

Then update the test class and start it.

/**
 * Kafka Java API test
 */
public class KafkaClientApp {

    public static void main(String[] args) {

        new KafkaProducer(KafkaProperties.TOPIC).start();

        new KafkaConsumer(KafkaProperties.TOPIC).start();
    }
}

The console now shows both the producer's "Send ..." lines and the consumer's "rec: ..." lines.

3. Integrating Kafka with Flume

avro-memory-kafka.conf

avro-memory-kafka.sources = avro-source
avro-memory-kafka.sinks = kafka-sink
avro-memory-kafka.channels = memory-channel

avro-memory-kafka.sources.avro-source.type = avro
avro-memory-kafka.sources.avro-source.bind = hadoop000
avro-memory-kafka.sources.avro-source.port = 44444

avro-memory-kafka.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
avro-memory-kafka.sinks.kafka-sink.brokerList = hadoop000:9092
avro-memory-kafka.sinks.kafka-sink.topic = kafka_streaming_topic
avro-memory-kafka.sinks.kafka-sink.batchSize = 5
avro-memory-kafka.sinks.kafka-sink.requiredAcks = 1

avro-memory-kafka.channels.memory-channel.type = memory

avro-memory-kafka.sources.avro-source.channels = memory-channel
avro-memory-kafka.sinks.kafka-sink.channel = memory-channel
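
In the configuration above, the source and the sink meet only through the channel: the source writes to every channel listed under channels, and the sink reads from the single channel named under channel. Because a Flume agent file uses Java properties syntax, that wiring can be checked with java.util.Properties before the agent is started. The following is a rough sketch under that assumption; FlumeWiringCheck and its methods are hypothetical helpers, not part of Flume.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

/** Hypothetical helper: checks that sources and sinks point at declared channels. */
public class FlumeWiringCheck {

    /** Returns the wiring problems found for the given agent; an empty list means none. */
    public static List<String> check(String agent, String confText) throws IOException {
        Properties conf = new Properties();
        conf.load(new StringReader(confText));

        Set<String> channels = names(conf.getProperty(agent + ".channels"));
        List<String> problems = new ArrayList<>();

        // A source may feed several channels (space-separated list).
        for (String source : names(conf.getProperty(agent + ".sources"))) {
            Set<String> used = names(conf.getProperty(agent + ".sources." + source + ".channels"));
            if (used.isEmpty()) {
                problems.add("source " + source + " has no channel");
            }
            for (String channel : used) {
                if (!channels.contains(channel)) {
                    problems.add("source " + source + " uses undeclared channel " + channel);
                }
            }
        }

        // A sink reads from exactly one channel.
        for (String sink : names(conf.getProperty(agent + ".sinks"))) {
            String channel = conf.getProperty(agent + ".sinks." + sink + ".channel");
            if (channel == null || !channels.contains(channel.trim())) {
                problems.add("sink " + sink + " uses undeclared channel " + channel);
            }
        }
        return problems;
    }

    /** Splits a space-separated component list; null or blank gives an empty set. */
    private static Set<String> names(String value) {
        Set<String> result = new LinkedHashSet<>();
        if (value != null && !value.trim().isEmpty()) {
            result.addAll(Arrays.asList(value.trim().split("\\s+")));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // The wiring-related lines of avro-memory-kafka.conf.
        String conf = String.join("\n",
            "avro-memory-kafka.sources = avro-source",
            "avro-memory-kafka.sinks = kafka-sink",
            "avro-memory-kafka.channels = memory-channel",
            "avro-memory-kafka.sources.avro-source.channels = memory-channel",
            "avro-memory-kafka.sinks.kafka-sink.channel = memory-channel");
        System.out.println("Wiring problems: " + check("avro-memory-kafka", conf));
    }
}
```

A misspelled channel name on the sink line, for example, would show up as one entry in the returned list instead of surfacing only when the agent starts.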

Workflow:

Start Kafka first:

kafka-server-start.sh -daemon /home/Kiku/app/kafka_2.11-0.9.0.0/config/server.properties

Start the Flume agents

Start the avro-memory-kafka agent first. It listens on port 44444, so it has to be up before anything sends events to that port:
flume-ng agent \
--name avro-memory-kafka \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/avro-memory-kafka.conf \
-Dflume.root.logger=INFO,console
Then start exec-memory-avro:
flume-ng agent \
--name exec-memory-avro \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
-Dflume.root.logger=INFO,console

 
