A quick write-up of a small program I built recently: it simulates a large fleet of vehicles reporting odometer and other status data to Kafka at irregular intervals, in order to load-test a backend batch-processing application.
Each vehicle reports a message in JSON format, assumed to contain the following fields:
{
  "telemetry": {
    "engineStatus": 1,
    "odometer": 120,
    "doorStatus": 1
  },
  "timestamp": 1683797176608,
  "deviceId": "abc",
  "messageType": 1,
  "version": "1.0"
}
The first step is to get a batch of vehicle deviceIds. The vehicle data is stored in a PostgreSQL database, so I connected with psql and exported 10,000 rows from the vehicle table to a local CSV file with the following command:
\copy (select * from vehicle limit 10000) to '/tmp/vehicle.csv' with csv;
Create a new Java project and put the exported CSV file in the src/main/resources directory.
Next, write code that reads the deviceId column from this CSV file (I used the opencsv library for the parsing) and assigns each device a random initial odometer value. The program then simulates this fleet over the past hour, with each device sending a message roughly every 10 seconds reporting how its odometer has changed.
In addition, to improve the throughput of message delivery to Kafka, we can tune the producer's compression type along with the buffer.memory, batch.size, and linger.ms settings. It is worth benchmarking a few combinations and picking the one that performs best; the gains from this tuning can be quite significant. A sketch of such a comparison follows.
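As an illustration, here is a minimal, self-contained benchmark sketch, not part of the simulator itself: it sends the same fixed number of messages under each compression codec and prints the elapsed time. The broker address, topic name, message count, and payload here are placeholder assumptions for illustration.

package com.example;

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompressionBenchmark {
    public static void main(String[] args) {
        // try each codec with the same batch.size/linger.ms and compare elapsed time
        for (String codec : new String[] {"none", "gzip", "snappy", "lz4"}) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed local broker
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, codec);
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 262144); // 256 KB batches
            props.put(ProducerConfig.LINGER_MS_CONFIG, 10);      // wait up to 10 ms to fill a batch
            long start = System.currentTimeMillis();
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < 100_000; i++) { // arbitrary message count for the comparison
                    producer.send(new ProducerRecord<>("test.telemetry", "{\"odometer\":" + i + "}"));
                }
                producer.flush();
            }
            System.out.printf("compression=%s: %d ms%n", codec, System.currentTimeMillis() - start);
        }
    }
}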
The full code of the simulator is straightforward:
package com.example;

import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Properties;
import java.util.Random;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;

/**
 * Simulate the telemetry messages!
 *
 */
public class App
{
    private static final Logger LOG = LoggerFactory.getLogger(App.class);
    private static String vehicle_file = "vehicle.csv";
    private static String telemetry_msg = "{\"telemetry\":{\"engineStatus\":1,\"odometer\":%d,\"doorStatus\":1},\"timestamp\":%d,\"deviceId\":\"%s\",\"messageType\":1,\"version\":\"1.0\"}";
    private static int interval = 10; // interval to send message per vehicle, default to 10s
    private static final int total_loops = 3600 / interval;

    public static void main( String[] args )
    {
        HashMap<String, Integer> device_odometer_map = new HashMap<>();
        Random random = new Random();
        // Get the vehicle device id from the exported CSV on the classpath
        try {
            CSVReader reader = new CSVReaderBuilder(
                    new InputStreamReader(
                            App.class.getClassLoader()
                                    .getResourceAsStream(vehicle_file))).build();
            String[] nextLine;
            while ((nextLine = reader.readNext()) != null) {
                // column 8 holds the deviceId; seed each device with a random initial odometer
                device_odometer_map.put(nextLine[8], (int) (random.nextFloat() * 1000));
            }
        } catch (Exception e) {
            LOG.error("Failed to read {}", vehicle_file, e);
        }
        ArrayList<String> device_id = new ArrayList<>(device_odometer_map.keySet());
        int total_device_id = device_id.size();
        long end_timestamp = System.currentTimeMillis();
        long start_timestamp = end_timestamp - (3600 * 1000);
        String bootstrapServers = "localhost:9092";

        // create Producer properties
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        //properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 67108864);
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 262144);
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 10);

        // create the producer
        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        int msg_count = 0;
        int total_msg = total_device_id * total_loops;
        for (int count = 0; count < total_loops; count++) {
            for (int i = 0; i < total_device_id; i++) {
                String id = device_id.get(i);
                // advance the odometer by a small random distance since the last report
                int odometer = device_odometer_map.get(id) + random.nextInt(5);
                device_odometer_map.put(id, odometer);
                // jitter the timestamp so each vehicle reports "about" every 10 seconds
                long ts = start_timestamp + random.nextInt(1000);
                String msg = String.format(telemetry_msg, odometer, ts, id);
                ProducerRecord<String, String> producerRecord =
                        new ProducerRecord<>("test.telemetry", msg);
                // send data - asynchronous
                producer.send(producerRecord);
                msg_count++;
            }
            System.out.print(String.format("Sending telemetry messages to topic test.telemetry: %d/%d \r", msg_count, total_msg));
            start_timestamp += interval * 1000;
        }
        // flush and close producer
        producer.flush();
        producer.close();
    }
}
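After the run, a quick way to confirm that the messages actually landed on the topic is a minimal consumer sketch like the one below; the group id telemetry-check is an arbitrary choice, not something the simulator requires.

package com.example;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TelemetryCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "telemetry-check");   // arbitrary group id
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // read from the beginning
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test.telemetry"));
            // poll once and print a sample of the JSON payloads
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> r : records) {
                System.out.println(r.value());
            }
        }
    }
}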
The POM file is as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>telemetry-simulate</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>telemetry-simulate</name>
  <url>http://maven.apache.org</url>
  <properties>
    <maven-compiler-plugin.version>3.7.0</maven-compiler-plugin.version>
    <maven-jar-plugin.version>3.3.0</maven-jar-plugin.version>
    <maven-shade-plugin.version>3.2.4</maven-shade-plugin.version>
    <java.version>1.8</java.version>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>${maven-jar-plugin.version}</version>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <classpathPrefix>lib/</classpathPrefix>
              <mainClass>com.example.App</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>${maven-compiler-plugin.version}</version>
        <configuration>
          <source>${java.version}</source>
          <target>${java.version}</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>${maven-shade-plugin.version}</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.25</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.7.25</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>com.opencsv</groupId>
      <artifactId>opencsv</artifactId>
      <version>5.4</version>
    </dependency>
  </dependencies>
</project>
Finally, run mvn clean package and start the simulator with java -jar target/telemetry-simulate-1.0-SNAPSHOT.jar (the runnable jar produced by the shade plugin).