Kafka zero-copy
With zero-copy (sendfile), the data never passes through user space:
1. Data is copied from the kernel page cache to the socket buffer.
2. From the socket buffer it is copied to the NIC (network adapter) buffer and transmitted over the network.
Traditional path:
1. Data is read from disk into the page cache in kernel space.
2. The application reads the data from kernel space into a user-space buffer.
3. The application copies the data from the user-space buffer into the socket buffer.
4. From the socket buffer it is copied to the NIC (network adapter) buffer.
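For illustration, a minimal JVM-level sketch of the zero-copy path (the mechanism Kafka's sendfile-based transfer relies on), using java.nio FileChannel.transferTo so the kernel moves bytes from the page cache to the socket without a user-space copy; the file path, host and port are placeholders.
import java.io.FileInputStream
import java.net.InetSocketAddress
import java.nio.channels.SocketChannel

object ZeroCopySend {
  def main(args: Array[String]): Unit = {
    val fileChannel = new FileInputStream("/tmp/segment.log").getChannel      // placeholder file
    val socket = SocketChannel.open(new InetSocketAddress("localhost", 9092)) // placeholder endpoint
    try {
      val size = fileChannel.size()
      var position = 0L
      // transferTo hands the copy to the kernel: page cache -> socket buffer -> NIC,
      // skipping steps 2 and 3 of the traditional path above
      while (position < size) {
        position += fileChannel.transferTo(position, size - position, socket)
      }
    } finally {
      fileChannel.close()
      socket.close()
    }
  }
}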
Spark Streaming + Kafka Integration
Receiver-based Approach
1. Kafka topic partitions have no relationship to the RDD partitions generated by Spark Streaming.
2. Increasing the number of partitions in KafkaUtils.createStream only increases the number of threads within a single receiver; it does not increase Spark's parallelism.
3. You can create multiple Kafka input DStreams with different groups and topics, so that several receivers receive data in parallel (raising Spark's parallelism).
4. If a fault-tolerant storage system such as HDFS is available and the write-ahead log is enabled, the received data is already replicated in the log.
Therefore set the input stream's storage level to StorageLevel.MEMORY_AND_DISK_SER, i.e. KafkaUtils.createStream(..., StorageLevel.MEMORY_AND_DISK_SER); a sketch follows below.
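A minimal sketch of the receiver-based approach, assuming the spark-streaming-kafka-0-8 createStream API; the ZooKeeper quorum, group id and topic/thread map are placeholders.
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("receiver-based-demo")
// set spark.streaming.receiver.writeAheadLog.enable=true for zero data loss
val ssc = new StreamingContext(conf, Seconds(10))

// topic -> receiver threads; more threads per receiver, not more Spark parallelism
val topicMap = Map("topicA" -> 2, "topicB" -> 2)

val lines = KafkaUtils.createStream(
  ssc,
  "zkhost1:2181,zkhost2:2181",        // ZooKeeper quorum (placeholder)
  "my-consumer-group",                // consumer group id (placeholder)
  topicMap,
  StorageLevel.MEMORY_AND_DISK_SER    // no extra in-memory replication; the WAL already has a copy
).map(_._2)                           // createStream yields (key, message) pairs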
Direct Approach (no receivers: direct connection, without a receiver consuming messages)
Simplified parallelism: there is no need to create multiple Kafka input streams and union them. With directStream, Spark Streaming creates as many RDD partitions as there are Kafka partitions to consume, and all of them read from Kafka in parallel, so there is a one-to-one mapping between Kafka partitions and RDD partitions.
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092,anotherhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],      // StringDeserializer for key/value deserialization
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "use_a_separate_group_id_for_each_stream",
  "auto.offset.reset" -> "latest",                         // reset to the latest offset automatically
  "enable.auto.commit" -> (false: java.lang.Boolean)
)
val topics = Array("topicA","topicB")
val stream = KafkaUtils.createDirectStream[String, String](
  streamingContext,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams)
)
stream.map(record =>(record.key,record.value))
Create an RDD over a defined offset range, for batch processing:
val offsetRanges = Array(
  // topic, partition, inclusive starting offset, exclusive ending offset
  OffsetRange("test", 0, 0, 100),   // partition 0, offsets 0-99
  OffsetRange("test", 1, 0, 100)    // partition 1, offsets 0-99
)
val rdd = KafkaUtils.createRDD[String,String](sparkContext,kafkaParams,offsetRanges,PreferConsistent)
Obtaining Offsets
stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  rdd.foreachPartition { iter =>
    val o: OffsetRange = offsetRanges(TaskContext.get.partitionId)
    println(s"${o.topic} ${o.partition} ${o.fromOffset} ${o.untilOffset}")
  }
}
Efficiency: with the first approach, achieving zero data loss requires storing the data in a write-ahead log, which replicates the data again. This is inefficient: the data is copied twice, once by Kafka and once into the write-ahead log (WAL). The direct approach removes this problem: there is no receiver and therefore no WAL is needed, provided Kafka's data retention is long enough.
Exactly-once:
1 - Receiver: uses Kafka's high-level API to store the consumed offsets in ZooKeeper, which is the traditional way of consuming Kafka data. Combined with a WAL it can ensure zero data loss (at-least-once), but on failure some messages may be consumed twice, because the data reliably received by Spark Streaming and the offsets tracked by ZooKeeper can get out of sync.
2 - Direct: ZooKeeper is not used to track consumed offsets; Spark Streaming tracks the offsets in its checkpoints. This removes the inconsistency between Spark Streaming and ZooKeeper when reading and tracking offsets, so each record is effectively received exactly once even under failures. To also get exactly-once semantics for the results, the operation that saves data to the external store must either be idempotent (produce the same output every time)
or be an atomic transaction that saves the results and the offsets together.
Storing offsets in external storage
(1) Checkpoint
1. Enabling Spark Streaming's checkpoint is the simplest way to store offsets (it can still be inconsistent: the offset may be recorded before the batch is successfully processed).
Drawbacks:
1. Spark cannot recover across applications.
2. A Spark upgrade breaks recovery.
3. For critical production applications, managing offsets with Spark checkpoints is not recommended (upgrades, no cross-application recovery).
2. Streaming checkpoints are intended for saving application state, e.g. to HDFS, so that it can be restored after a failure; a getOrCreate sketch follows below.
Compared with ZooKeeper or HBase, HDFS has higher latency; if managed badly, writing every batch's offsetRanges to HDFS can also lead to a small-files problem.
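A minimal sketch of checkpoint-based recovery with StreamingContext.getOrCreate; the HDFS checkpoint directory is a placeholder.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///user/spark/streaming-checkpoint"   // placeholder path

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("checkpoint-offset-demo")
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDir)
  // build the Kafka direct stream and the processing graph here
  ssc
}

// a clean start calls createContext(); after a failure the context (including
// the offsets it was tracking) is rebuilt from the checkpoint data
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()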
(2) HBase
1. A generic HBase design: a single table can store the offsets of topics across multiple Spark Streaming applications.
2. rowkey = topicName + groupid + batchTimeOfStreaming (milliseconds): batchTime.milliSeconds is not strictly required, but it lets you inspect how offsets were managed for historical batches (a save helper is sketched at the end of this subsection).
3. Kafka offsets are stored in the following table, expiring automatically after 30 days:
create 'spark_kafka_offsets',{NAME=>'offsets',TTL=>2592000}
4. Offset-retrieval scenarios
Scenario 1: the Streaming job starts for the first time. It queries ZooKeeper for the number of partitions of the given topic and returns 0 as the offset for every topic partition.
Scenario 2: a long-running Streaming job was stopped and new partitions were added to the Kafka topic. It queries ZooKeeper for the number of partitions of the given topic; for all pre-existing topic partitions the offset is set to the latest offset stored in HBase, and for all new partitions it returns 0.
Scenario 3: a long-running Streaming job was stopped and the topic has not changed. The latest offsets found in HBase are returned as the offsets for each topic partition.
hbase(main):009:0> scan 'spark_kafka_offsets'
stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // some time later, after outputs have completed
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
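The save side of the HBase design above can be sketched as follows; it assumes the spark_kafka_offsets table and offsets column family created above, and saveOffsets plus the rowkey layout are illustrative helpers, not a fixed API.
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.kafka010.OffsetRange

// hypothetical helper: persist one batch's offsets under rowkey = topic + groupId + batchTime (ms)
def saveOffsets(topic: String, groupId: String, batchTime: Long,
                offsetRanges: Array[OffsetRange]): Unit = {
  val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = conn.getTable(TableName.valueOf("spark_kafka_offsets"))
  try {
    val put = new Put(Bytes.toBytes(s"$topic:$groupId:$batchTime"))
    offsetRanges.foreach { o =>
      // one column per partition, value = this batch's untilOffset
      put.addColumn(Bytes.toBytes("offsets"),
                    Bytes.toBytes(o.partition.toString),
                    Bytes.toBytes(o.untilOffset.toString))
    }
    table.put(put)
  } finally {
    table.close()
    conn.close()
  }
}
On restart, a reverse scan over rows with the topic + groupId prefix yields the most recent batch's offsets, covering scenarios 2 and 3 above.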
(3) ZooKeeper
1. Path:
val zkPath = s"${kafkaOffsetRootPath}/${groupName}/${o.topic}/${o.partition}"
2. If no offset is stored in ZooKeeper, start from the latest or earliest offset according to the kafkaParams configuration.
3. If an offset is stored in ZooKeeper, use it as the starting position of the Kafka stream; a save helper is sketched below.
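A minimal sketch of saving offsets under that zkPath layout, assuming the Apache Curator client (any ZooKeeper client works); saveOffsetsToZk is a hypothetical helper.
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.spark.streaming.kafka010.OffsetRange

def saveOffsetsToZk(zkQuorum: String, kafkaOffsetRootPath: String,
                    groupName: String, offsetRanges: Array[OffsetRange]): Unit = {
  val client = CuratorFrameworkFactory.newClient(zkQuorum, new ExponentialBackoffRetry(1000, 3))
  client.start()
  try {
    offsetRanges.foreach { o =>
      val zkPath = s"$kafkaOffsetRootPath/$groupName/${o.topic}/${o.partition}"
      if (client.checkExists().forPath(zkPath) == null) {
        client.create().creatingParentsIfNeeded().forPath(zkPath)
      }
      // store the end of this batch as the next starting offset
      client.setData().forPath(zkPath, o.untilOffset.toString.getBytes("UTF-8"))
    }
  } finally {
    client.close()
  }
}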
Log in with zkCli and inspect:
ls /kafka0.9/mykafka/consumer/offsets/testp/mytest1
get /kafka0.9/mykafka/consumer/offsets/testp/mytest1/0
Drawback: if Hadoop, Hive, Spark and HBase are all deployed as clusters that depend on ZooKeeper, this adds load to an already busy ZooKeeper ensemble, making ZooKeeper failures more likely and disrupting the whole cluster.
(4) Kafka
Kafka itself can commit offsets periodically via the enable.auto.commit parameter, which guarantees that the offsets are stored.
But a problem remains: if the batch's Spark output operation has not yet completed successfully, the already-committed read position pollutes the result and leads to undefined semantics. This is why Spark disables the feature by default (enable.auto.commit=false).
You can instead use the commitAsync API. Its advantage over checkpoints is that Kafka keeps the offsets durably regardless of changes to your application code (checkpoints are sensitive to code changes and force you to re-specify offsets). However, Kafka commits are not transactional, so you must still ensure idempotent output, just as with
checkpoints.
stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // some time later, after outputs have completed
  // CanCommitOffsets is only available on the result of createDirectStream, not after transformations
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
  // commitAsync is thread-safe, but to be meaningful it must happen after your outputs have completed
}
(5) Your own data store:
For data stores that support transactions, saving the offsets in the same transaction as the results keeps the two synchronized and consistent even on failure.
If you are careful about detecting repeated or skipped offset ranges, rolling back the transaction prevents duplicated or lost messages from being committed. This yields exactly-once semantics.
It is even possible to use this strategy for the results of aggregations, which are otherwise usually hard to make idempotent.
// the details depend on your data store, but the general idea looks like this
// (this is the ideal case; message data rarely needs transaction support)
// begin from the offsets committed to the database
val fromOffsets = selectOffsetsFromYourDatabase.map { resultSet =>   // offsets keyed by topic + partition (+ batchTime in ms)
  new TopicPartition(resultSet.string("topic"), resultSet.int("partition")) -> resultSet.long("offset")
}.toMap
val stream = KafkaUtils.createDirectStream[String,String](
streamingContext,
PreferConsistent,
Assign[String,String](fromOffsets.keys.toList,kafkaParams,fromOffsets)
)
stream.foreachRDD{ rdd =>
val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
val results = yourCalculation(rdd)
  // begin your transaction
  // insert/update results
  // update offsets where the end of the existing offsets matches the beginning of this batch of offsets
  // assert that offsets were updated correctly
  // end your transaction
}
SSL/TLS: securing the communication between Spark and Kafka
val kafkaParams = Map[String, Object](
  // the usual params; make sure to change the port in bootstrap.servers if 9092 is not TLS
  "security.protocol" -> "SSL",
  "ssl.truststore.location" -> "/some-directory/kafka.client.truststore.jks",
  "ssl.truststore.password" -> "test1234",
  "ssl.keystore.location" -> "/some-directory/kafka.client.keystore.jks",
  "ssl.keystore.password" -> "test1234",
  "ssl.key.password" -> "test1234"
)
(6) Not storing Kafka offsets at all: tolerating some data loss
(7) Deciding whether to manage offsets based on business needs
1. For example, real-time activity monitoring only needs the latest data, so offsets need not be managed. In that case, with the old low-level API set auto.offset.reset to largest or smallest; with the new API set auto.offset.reset to "latest" (or "earliest" to replay); a config sketch follows.
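A minimal kafkaParams sketch for that monitoring case, with a placeholder broker list and group id; auto-commit is left on here because the application deliberately does not manage offsets.
val monitoringKafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",                   // placeholder broker
  "key.deserializer" -> classOf[org.apache.kafka.common.serialization.StringDeserializer],
  "value.deserializer" -> classOf[org.apache.kafka.common.serialization.StringDeserializer],
  "group.id" -> "monitoring-group",                          // placeholder group id
  "auto.offset.reset" -> "latest",                           // only care about the newest data
  "enable.auto.commit" -> (true: java.lang.Boolean)          // let Kafka commit periodically; losses are acceptable
)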
One broker corresponds to one broker.id (server.properties).
Kafka installation: one machine can host one or more brokers, depending on requirements and the production workload.
1. Download a Kafka version compatible with your Spark and HDFS versions.
2. Unpack it:
tar xzvf kafka.tar.gz
3. Install (abbreviated) and configure ZooKeeper (zoo.cfg).
Create the ZooKeeper data and log directories:
sudo mkdir /usr/cdh/spark/zkdata/
sudo mkdir /usr/cdh/spark/zkdata/zklogs
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
#zookeeper logs
dataDir=/usr/cdh/zookeeper/data/
#hadoop zk logs
dataDir=/usr/cdh/hadoop/zkdata
dataLogDir=/usr/cdh/hadoop/zkdata/zklogs
#spark zk logs
dataDir=/usr/cdh/spark/zkdata/
dataLogDir=/usr/cdh/spark/zkdata/zklogs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
#assign several hostname:port pairs to server.id
# The following entries are only needed in a (multi-machine) cluster
#server.0=Master:2888:3888
#server.1=Worker1:2888:3888
#server.2=Worker2:2888:3888
4. For the corresponding user, add the Kafka environment variables to the profile:
vim .profile
export KAFKA_HOME=/usr/cdh/kafka
export PATH=$PATH:$KAFKA_HOME/bin:$SCALA_HOME/bin:$JAVA_HOME/bin
Apply the configuration:
. .profile
5. Configure the Kafka broker configuration files (one file per broker).
broker.id, listeners and log.dirs must differ between the server.properties files.
server.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
# Each broker has one broker.id; ids must be unique within the Kafka cluster, otherwise the broker with the duplicate id will not start.
broker.id=0
############################# Socket Server Settings #############################
# If several brokers run on the same machine, each one must listen on a different port, otherwise the second broker will not start.
listeners=PLAINTEXT://:9092
# The port the socket server listens on
#port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
#host.name=localhost
# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured. Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
#advertised.host.name=
# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=
# The number of threads handling network requests
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
# Directory where Kafka stores its message files; very important. If one machine hosts several brokers, each broker needs its own directory, otherwise the same problem as above occurs.
log.dirs=/tmp/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
# ZooKeeper connection URL for Kafka; it is recommended to append a chroot path so that all Kafka znodes are grouped under one directory.
zookeeper.connect=hadoop:2181/kafka0.9
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
Second broker on the same machine: server1.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
# Each broker has one broker.id; ids must be unique within the Kafka cluster, otherwise the broker with the duplicate id will not start.
broker.id=1
############################# Socket Server Settings #############################
# If several brokers run on the same machine, each one must listen on a different port, otherwise the second broker will not start.
listeners=PLAINTEXT://:19092
# The port the socket server listens on
#port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
#host.name=localhost
# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured. Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
#advertised.host.name=
# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=
# The number of threads handling network requests
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
# Directory where Kafka stores its message files; very important. If one machine hosts several brokers, each broker needs its own directory, otherwise the same problem as above occurs.
log.dirs=/tmp/kafka-logs1
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
# ZooKeeper connection URL for Kafka; it is recommended to append a chroot path so that all Kafka znodes are grouped under one directory.
zookeeper.connect=hadoop:2181/kafka0.9
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
server.properties for a second broker on a different machine
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
# Each broker has one broker.id; ids must be unique within the Kafka cluster, otherwise the broker with the duplicate id will not start.
broker.id=1
############################# Socket Server Settings #############################
# If several brokers run on the same machine, each one must listen on a different port, otherwise the second broker will not start; brokers on different machines may use the same port.
listeners=PLAINTEXT://:9092
# The port the socket server listens on
#port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
#host.name=localhost
# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured. Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
#advertised.host.name=
# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=
# The number of threads handling network requests
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
# Directory where Kafka stores its message files; very important. If one machine hosts several brokers, each broker needs its own directory, otherwise the same problem as above occurs; brokers on different machines may use the same directory.
log.dirs=/tmp/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
# ZooKeeper connection URL for Kafka; it is recommended to append a chroot path so that all Kafka znodes are grouped under one directory. otherhostname: the other machine's IP or hostname.
zookeeper.connect=otherhostname:2181/kafka0.9
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
6. Start Kafka:
kafka-server-start.sh -daemon /usr/cdh/kafka/config/server.properties
kafka-server-start.sh -daemon /usr/cdh/kafka/config/server1.properties
Shut down:
kafka-server-stop.sh /usr/cdh/kafka/config/server.properties
7. Kafka operations
(a) Create a topic
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --create --topic mykafka --replication-factor 2 --partitions 3
The replication factor cannot be larger than the number of brokers, otherwise an error is reported:
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --create --topic mykafka --replication-factor 3 --partitions 3
Error while executing topic command : replication factor: 3 larger than available brokers: 2
[2018-04-05 19:31:35,515] ERROR kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 2
at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:77)
at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:236)
at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:105)
at kafka.admin.TopicCommand$.main(TopicCommand.scala:60)
at kafka.admin.TopicCommand.main(TopicCommand.scala)
(kafka.admin.TopicCommand$)
(b) List topics (for detailed options, run kafka-topics.sh --help)
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --list
mykafka
(c) Delete a topic
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --delete --topic mykafka
Use zkCli.sh to delete the topic's metadata in ZooKeeper:
[zk: localhost:2181(CONNECTED) 13] ls /kafka0.9/brokers/topics
[mykafka]
[zk: localhost:2181(CONNECTED) 14] rmr /kafka0.9/brokers/topics/mykafka
Check again:
spark@hadoop:~$ kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --list
(d) If a topic was created incorrectly, just delete it and recreate it; updating or altering it is cumbersome and error-prone.
(e) Start a console producer
spark@hadoop:~$ kafka-console-producer.sh --broker-list hadoop:9092,hadoop:19092 --topic mykafka
Type the following lines; they show up in the consumer console:
df
jack
mary
(f) Start a console consumer
kafka-console-consumer.sh --zookeeper hadoop:2181/kafka0.9 --topic mykafka
df
jack
mary
How Kafka stores data
Logically, data is stored per topic: messages are divided by topic, and each topic manages the data of its partitions.
Physically, data is stored per partition. Each partition directory contains segments, and each segment consists of a *.index and a *.log file.
*.index: the index file Kafka uses to locate a message; *.log: the actual message data. For example:
Each partition directory is named topic + ordinal number (starting from 0, up to the number of partitions minus 1):
drwxr-xr-x 2 spark hadoop 4096 4月 5 20:01 mykafka-0/
drwxr-xr-x 2 spark hadoop 4096 4月 5 20:01 mykafka-1/
drwxr-xr-x 2 spark hadoop 4096 4月 5 20:01 mykafka-2/
How data is located:
1. Within a partition, messages are stored in offset order (Kafka only guarantees ordering per partition, not globally across partitions or machines).
2. Offsets in the data files start at 0. So for 00000000000000000000.log, 00000000000000170410.log and 00000000000000239430.log:
00000000000000000000.log starts at offset 0 and ends at 170410;
00000000000000170410.log starts at offset 170410 + 1;
00000000000000239430.log starts at offset 239430 + 1.
How a message is looked up:
1. Find the topic, then the directory of the requested partition under that topic.
2. Based on the offset, pick the matching *.index file and from it the corresponding *.log file.
3. The *.log file name gives the segment's base offset within the partition; base offset plus relative offset gives the address of the message data.
4. If the requested offset is larger than the largest entry of the index file, move on to the next index file; otherwise the message lies in the data file covered by this index.
5. From the index file, find the message's position inside the .log file and hence its physical address.
Example: an index entry [3, 348] means the 3rd message of this segment, stored at physical position 348 in the .log file; its global offset is 170410 + 3. To read the message at offset 170418:
1. Search the index files for offset 170418.
2. The first index file is 00000000000000000000.index; the second is 00000000000000170410.index (its segment starts at offset 170410 + 1); the third is 00000000000000239430.index (starting at 239430 + 1). Offset 170418 therefore falls in the second file.
3. In that index file, look up relative offset 170418 - 170410 = 8 to get the physical position, then read the message data at that position in the .log file. A small helper expressing this segment selection is sketched below.
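The following small helper expresses that segment selection; the base offsets are taken from the example above (each segment file is named after the offset just before its first message).
val segmentBaseOffsets = Vector(0L, 170410L, 239430L)

// pick the last segment whose base offset is <= the target offset,
// and the relative offset to look up inside that segment's .index file
def locate(targetOffset: Long): (Long, Long) = {
  val base = segmentBaseOffsets.takeWhile(_ <= targetOffset).last
  (base, targetOffset - base)
}

locate(170418L)   // -> (170410, 8): look up relative offset 8 in 00000000000000170410.index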
Partition
1. A topic organizes its messages across multiple partitions.
2. Increasing the number of partitions raises read/write concurrency.
3. A partition physically consists of multiple segments; each segment has a log file, an index file and a time-index file. Each log file is also called a segment file. Segments need not hold the same number of messages, which makes it easy to delete old segments and clean up already-consumed messages, improving disk utilization; each partition only needs sequential reads and writes. Segment lifetime is controlled by broker settings such as log.segment.bytes and log.roll.{ms,hours}.
4. A partition can have several replicas, but only one replica is the leader.
5. Reads and writes of a partition go only through the leader.
6. Segment file naming: a segment is named after the offset of its first message minus 1 (i.e. the offset of the last message of the previous segment).
Message
Each message has exactly one offset.
Messages are only appended to a segment; they cannot be modified or deleted individually.
Segments are deleted periodically (settings: log.segment.bytes, log.roll.{ms,hours}, log.retention.bytes/hours), by default after 7 days.
An offset is a monotonically increasing sequence number, at most 8 bytes long.
Consumer Group
1. A consumer group can contain multiple consumer instances.
2. Consumer groups consume independently of one another.
3. Consumer instances within a group consume in parallel and never consume the same message twice.
4. Based on the number of partitions, give the group a sensible number of consumer instances so the concurrency is right (neither idle nor overloaded).
High-level API
1. You do not manage offsets yourself.
2. By default it gives at-least-once semantics.
3. If there are more consumers than partitions, resources are wasted (wasteful, but the workload is still served).
4. If there are fewer consumers than partitions, one consumer serves several partitions and may be overloaded.
5. Ideally the number of partitions is an integer multiple of the number of consumers.
Low-level consumer API
You manage offsets yourself.
Any message-delivery semantics can be implemented.
Message acknowledgement (producer acks):
0: do not wait for confirmation that the message was received.
1: only the partition leader must confirm that it received the message.
-1 (all): the partition leader and its in-sync followers must confirm receipt.
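A minimal producer sketch showing where these three levels are set, via the standard acks producer setting; the broker address and topic are placeholders.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")   // placeholder broker
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
// "0": fire and forget; "1": wait for the partition leader only;
// "all" (the newer spelling of -1): wait for the leader and the in-sync replicas
props.put(ProducerConfig.ACKS_CONFIG, "all")

val producer = new KafkaProducer[String, String](props)
producer.send(new ProducerRecord[String, String]("mykafka", "key", "value"))
producer.close()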