Distributed Message Middleware (7): Kafka Installation and Configuration Explained (Linux)

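I. Preparing the Zookeeper Cluster

Kafka relies on Zookeeper for cluster management, so set up a zk cluster before installing Kafka. For the detailed installation steps, see Linux Series (7): Zookeeper Cluster Setup.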

II. Kafka Core Configuration

1. Upload the archive and extract it

2. After extracting, cd into /lamp/kafka/config to see all Kafka configuration files

-rw-r--r--. 1 root root  906 May 17 21:26 connect-console-sink.properties
-rw-r--r--. 1 root root  909 May 17 21:26 connect-console-source.properties
-rw-r--r--. 1 root root 2760 May 17 21:26 connect-distributed.properties
-rw-r--r--. 1 root root  883 May 17 21:26 connect-file-sink.properties
-rw-r--r--. 1 root root  881 May 17 21:26 connect-file-source.properties
-rw-r--r--. 1 root root 1074 May 17 21:26 connect-log4j.properties
-rw-r--r--. 1 root root 2061 May 17 21:26 connect-standalone.properties
-rw-r--r--. 1 root root 1199 May 17 21:26 consumer.properties
-rw-r--r--. 1 root root 4369 May 17 21:26 log4j.properties
-rw-r--r--. 1 root root 1900 May 17 21:26 producer.properties
-rw-r--r--. 1 root root 5243 May 17 21:26 server.properties
-rw-r--r--. 1 root root 1032 May 17 21:26 tools-log4j.properties
-rw-r--r--. 1 root root 1041 Jun 18 11:19 zookeeper.properties
zookeeper.properties
# the directory where the snapshot is stored.
dataDir=/usr/local/cloud/zookeeper1/data
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
server.properties

############################# Server Basics: sets the broker ID; each Kafka instance is identified by a unique broker ID #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Log Basics: directory where Kafka persists its log files; in production, use a directory outside /tmp #############################

# A comma seperated list of directories under which to store log files
log.dirs=/tmp/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
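# Default number of partitions per topic on this broker.
# (Kept on its own line: .properties files do not support trailing comments.)
num.partitions=1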

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
# zk service address
zookeeper.connect=192.168.220.128:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

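server.properties is the Kafka broker configuration file. It mainly configures the broker's log storage, partitions, and zk connection. Besides the main settings shown above, it also contains the Log Retention Policy (configured by total retention time, segment size, and check interval) and the Log Flush Policy (for example, flush to disk once 10000 messages have accumulated, or every 1000 ms). All of these are tunable.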
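To make the retention and flush settings concrete, here is a minimal sketch that loads a server.properties-style fragment with java.util.Properties and reads a few values back. The keys are standard broker settings, but the values and the /data/kafka-logs path are illustrative assumptions, not recommendations. The last sample line also shows a pitfall of the .properties format: a `#` only starts a comment at the beginning of a line, so trailing text becomes part of the value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Loads a server.properties-style fragment and reads retention/flush settings. */
final class BrokerConfigDemo {

    // Illustrative values only; the path is a made-up example.
    static final String SAMPLE = String.join("\n",
        "broker.id=0",
        "log.dirs=/data/kafka-logs",
        "# retention: by age, by segment size, and how often to check",
        "log.retention.hours=168",
        "log.segment.bytes=1073741824",
        "log.retention.check.interval.ms=300000",
        "# flush: after N messages, or after T milliseconds",
        "log.flush.interval.messages=10000",
        "log.flush.interval.ms=1000",
        "num.partitions=1  # trailing text is NOT a comment");

    /** Parses .properties text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    /** Reads a numeric setting; throws if the value is not a plain number. */
    static long longSetting(Properties props, String key) {
        return Long.parseLong(props.getProperty(key).trim());
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        System.out.println("flush after " + longSetting(props, "log.flush.interval.messages") + " messages");
        System.out.println("flush every " + longSetting(props, "log.flush.interval.ms") + " ms");
        // The raw value still contains the trailing text, so it would not parse as a number.
        System.out.println("num.partitions raw value: '" + props.getProperty("num.partitions") + "'");
    }
}
```

Because of that last behavior, explanatory notes in server.properties are safest on their own comment lines.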

producer.properties

############################# Producer Basics #############################

# list of brokers used for bootstrapping knowledge about the rest of the cluster
# format: host1:port1,host2:port2 ...
bootstrap.servers=192.168.220.128:9092

# specify the compression codec for all data generated: none, gzip, snappy, lz4
compression.type=none
consumer.properties

# Zookeeper connection string
# comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"
zookeeper.connect=127.0.0.1:2181

# timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

#consumer group id
group.id=test-consumer-group

#consumer timeout
#consumer.timeout.ms=5000
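III. Common Kafka Operations

1. Starting Kafka

    Start the zk service before starting Kafka:  bin/zookeeper-server-start.sh  /usr/local/cloud/zookeeper1/conf/zoo.cfg  (the argument is the zk configuration file)

    Start Kafka: bin/kafka-server-start.sh /lamp/kafka/config/server.properties >/dev/null 2>&1 &  (runs Kafka in the background)

    Run jps to list the running Java processes; the broker appears as Kafka and Zookeeper as QuorumPeerMain.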


2. Creating a topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic max


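After the topic is created, look in Kafka's log directory (log.dirs): a log directory for the new topic now exists and holds its partition data, one directory per partition (log : partition = 1 : 1).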
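As a rough illustration of that one-directory-per-partition layout: Kafka names each partition directory `<topic>-<partition>` under log.dirs. The sketch below computes the directories you would expect for the topic created above and checks whether they exist. The class and method names are hypothetical helpers, and the path is the log.dirs value shown earlier in server.properties.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Computes the partition directories expected under log.dirs for a topic. */
final class PartitionDirs {

    /** One directory per partition, named "<topic>-<partition>", numbered from 0. */
    static List<Path> expectedDirs(String logDir, String topic, int partitions) {
        List<Path> dirs = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            dirs.add(Paths.get(logDir, topic + "-" + p));
        }
        return dirs;
    }

    public static void main(String[] args) {
        // Topic "max" was created with --partitions 1, so a single directory is expected.
        for (Path dir : expectedDirs("/tmp/kafka-logs", "max", 1)) {
            System.out.println(dir + (Files.isDirectory(dir) ? " exists" : " is missing"));
        }
    }
}
```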

3. Viewing topics

List all topics: bin/kafka-topics.sh --list --zookeeper localhost:2181  (returns the names of the topics created so far, e.g. max, linxi)

Show details for one topic: bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic max


4. Deleting a topic

bin/kafka-run-class.sh kafka.admin.TopicCommand --delete --topic max --zookeeper 192.168.220.128:2181

5. Creating a producer

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic max  (produces messages to the topic named max)


6. Creating a consumer

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic max --from-beginning


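Messages flow between producer and consumer in near real time: as soon as one side produces a message, the other side consumes it.

zk also records how far each partition has been consumed (the consumer offsets). Connect to the service with the zk client to inspect them.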
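For orientation, the ZooKeeper-based consumer used here keeps its state under /consumers/<group>/..., with one offsets znode per topic partition. The sketch below builds those paths so you know what to `get` in the zk client. The class and method names are hypothetical helpers, and the group name is an assumption taken from consumer.properties above; the console consumer may register under a different, generated group id.

```java
/** Builds the znode paths used by the ZooKeeper-based (legacy) consumer. */
final class ZkConsumerPaths {

    /** Znode holding the last committed offset for one partition. */
    static String offsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    /** Znode recording which consumer instance currently owns the partition. */
    static String ownerPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/owners/" + topic + "/" + partition;
    }

    public static void main(String[] args) {
        // Group name assumed from consumer.properties; adjust to the group actually in use.
        String group = "test-consumer-group";
        System.out.println("get " + offsetPath(group, "max", 0));
        System.out.println("get " + ownerPath(group, "max", 0));
    }
}
```

In the zk client, `ls /consumers` lists the groups that have registered, which is a quick way to find the real group id before reading an offset.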


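IV. Kafka Cluster

Setting up a Kafka cluster is similar to setting up a zk cluster and is very simple. With the zk cluster already running, create several Kafka instances, give each one a distinct broker.id in server.properties, and list all three zk instances in zookeeper.connect. If several instances share one host, each also needs its own port and log.dirs.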
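The per-broker differences can be sketched as follows. This is a minimal illustration, not a deployment tool: the host names zk1, zk2, zk3, the base port, and the log directory naming are all hypothetical, and the `port` key assumes an older broker version like the one used in this article (newer versions configure `listeners` instead). Distinct ports and log directories only matter when several brokers run on the same host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

/** Generates the settings that must differ between brokers in one cluster. */
final class ClusterConfigSketch {

    static List<Properties> brokerConfigs(int brokers, List<String> zkServers,
                                          int basePort, String baseLogDir) {
        // Every broker points at the same comma-separated zk ensemble.
        String zkConnect = String.join(",", zkServers);
        List<Properties> configs = new ArrayList<>();
        for (int id = 0; id < brokers; id++) {
            Properties p = new Properties();
            p.setProperty("broker.id", Integer.toString(id));        // must be unique
            p.setProperty("port", Integer.toString(basePort + id));  // unique per host
            p.setProperty("log.dirs", baseLogDir + "-" + id);        // unique per host
            p.setProperty("zookeeper.connect", zkConnect);
            configs.add(p);
        }
        return configs;
    }

    public static void main(String[] args) {
        // Hypothetical zk hosts and paths, for illustration only.
        List<String> zk = Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181");
        for (Properties p : brokerConfigs(3, zk, 9092, "/data/kafka-logs")) {
            System.out.println("# overrides for broker " + p.getProperty("broker.id"));
            for (String key : new TreeSet<>(p.stringPropertyNames())) {
                System.out.println(key + "=" + p.getProperty(key));
            }
            System.out.println();
        }
    }
}
```

Each printed group would be merged into that broker's own copy of server.properties, and each broker is then started with its own file.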

V. Client Implementation

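import java.util.Properties;

import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
import kafka.serializer.StringEncoder;

/*
 * Message producer (legacy kafka.javaapi producer API)
 */
public class Producer extends Thread {
	private final String topic;

	public Producer(String topic) {
		super();
		this.topic = topic;
	}

	@Override
	public void run() {
		// Fully qualified because this class is also named Producer
		kafka.javaapi.producer.Producer<String, String> producer = createProducer();
		try {
			producer.send(new KeyedMessage<String, String>(topic, "xiaowei"));
		} finally {
			producer.close();
		}
	}

	private kafka.javaapi.producer.Producer<String, String> createProducer() {
		Properties props = new Properties();
		props.setProperty("zookeeper.connect", "192.168.220.128:2181");
		props.setProperty("serializer.class", StringEncoder.class.getName());
		// Broker host:port, matching the address configured in producer.properties
		props.setProperty("metadata.broker.list", "192.168.220.128:9092");
		return new kafka.javaapi.producer.Producer<String, String>(new ProducerConfig(props));
	}

	public static void main(String[] args) {
		new Producer("max").start();
	}
}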

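import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

/*
 * Message consumer (legacy ZooKeeper-based high-level consumer API)
 */
public class Consumer extends Thread {
	private final String topic;

	public Consumer(String topic) {
		super();
		this.topic = topic;
	}

	@Override
	public void run() {
		// Create the consumer connection from props
		ConsumerConnector consumer = createConsumer();
		// topicMap: topic name -> number of streams (consumer threads) to open for it
		Map<String, Integer> topicMap = new HashMap<String, Integer>();
		topicMap.put(topic, 1);

		// msgStreams: the streams opened per topic; kafkaStream: the single stream requested above
		Map<String, List<KafkaStream<byte[], byte[]>>> msgStreams = consumer.createMessageStreams(topicMap);
		KafkaStream<byte[], byte[]> kafkaStream = msgStreams.get(topic).get(0);

		// Process messages; hasNext() blocks until the next message arrives
		ConsumerIterator<byte[], byte[]> iterator = kafkaStream.iterator();
		while (iterator.hasNext()) {
			byte[] msg = iterator.next().message();
			System.out.println("message is " + new String(msg, StandardCharsets.UTF_8));
		}
	}

	private ConsumerConnector createConsumer() {
		Properties props = new Properties();
		props.setProperty("zookeeper.connect", "192.168.220.128:2181");
		props.setProperty("group.id", "max");
		// Fully qualified because this class is also named Consumer
		return kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
	}

	public static void main(String[] args) {
		new Consumer("max").start();
	}
}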
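In real deployments, Kafka producers and consumers are usually parts of other projects or systems. In a big-data pipeline, for example, flume collects data and its sink is configured to write to Kafka; Kafka ingests and buffers that data at high throughput; storm then reads from Kafka and processes it. Here flume acts as the Kafka message producer and storm as the message consumer.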

