My understanding of the Samza Wikipedia example

The WikipediaFeedTaskApplication demonstrates how to consume multiple Wikipedia event streams and merge them into a single Apache Kafka topic.
 

// Define the input descriptor for the Wikipedia feed

WikipediaInputDescriptor wikipediaInputDescriptor; 

// Define the system descriptor for Kafka

KafkaSystemDescriptor kafkaSystemDescriptor = new KafkaSystemDescriptor("kafka")....

// Define the output descriptor for the wikipedia-raw Kafka topic

KafkaOutputDescriptor kafkaOutputDescriptor = kafkaSystemDescriptor.getOutputDescriptor("wikipedia-raw", new JsonSerde<>());

 

// Set Kafka as the default system descriptor

taskApplicationDescriptor.withDefaultSystem(kafkaSystemDescriptor);

 

// Set the inputs

taskApplicationDescriptor.withInputStream(wikipediaInputDescriptor);

// Set the outputs

taskApplicationDescriptor.withOutputStream(kafkaOutputDescriptor);

// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaFeedStreamTask());
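To make the "merge" step concrete, here is a minimal plain-Java sketch of what a feed task conceptually does for each incoming message: tag the event with the channel it came from and append it to one shared output. It deliberately does not use the Samza API. The class name FeedMergeSketch, the merge method, the "channel"/"raw" field names, and the sample channel names are all assumptions made for illustration, and the list stands in for the wikipedia-raw topic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FeedMergeSketch {

    // Hypothetical model of the per-message logic: each incoming event is a
    // {channel, rawText} pair; every event ends up in the same output list,
    // which stands in for the single "wikipedia-raw" topic.
    public static List<Map<String, String>> merge(List<String[]> incoming) {
        List<Map<String, String>> output = new ArrayList<>();
        for (String[] event : incoming) {
            Map<String, String> message = new LinkedHashMap<>();
            message.put("channel", event[0]); // which input stream it came from
            message.put("raw", event[1]);     // the unparsed event text
            output.add(message);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String[]> incoming = new ArrayList<>();
        // Illustrative channel names and payloads, not real feed data.
        incoming.add(new String[] {"#en.wikipedia", "edit A"});
        incoming.add(new String[] {"#en.wiktionary", "edit B"});
        incoming.add(new String[] {"#en.wikipedia", "edit C"});

        for (Map<String, String> message : merge(incoming)) {
            System.out.println(message);
        }
    }
}
```

In the real application the framework delivers messages one at a time to the task and the task sends each one to the output stream; the loop here only simulates that delivery.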

 

The WikipediaParserTaskApplication demonstrates how to project the incoming events from the Apache Kafka topic to a custom JSON data type.

// Define a system descriptor for Kafka, which is both our input and output system

KafkaSystemDescriptor kafkaSystemDescriptor;

 

// Input descriptor for the wikipedia-raw topic

KafkaInputDescriptor kafkaInputDescriptor = kafkaSystemDescriptor.getInputDescriptor("wikipedia-raw", new JsonSerde<>());

// Output descriptor for the wikipedia-edits topic

KafkaOutputDescriptor kafkaOutputDescriptor = kafkaSystemDescriptor.getOutputDescriptor("wikipedia-edits", new JsonSerde<>());

 

// Set the input

taskApplicationDescriptor.withInputStream(kafkaInputDescriptor);

// Set the output

taskApplicationDescriptor.withOutputStream(kafkaOutputDescriptor);

// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaParserStreamTask());
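The "projection" performed by the parser task can be pictured as a pure function from a raw line to a structured record. The sketch below is plain Java and makes several assumptions: the pipe-delimited input format (title|user|diffBytes), the field names, and the class ParserSketch are hypothetical and are not taken from the Samza example, whose real raw format is different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParserSketch {

    // Turns a raw line in the ASSUMED format "title|user|diffBytes" into a
    // map that a JSON serde could write out as a structured edit record.
    public static Map<String, Object> parse(String rawLine) {
        String[] parts = rawLine.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected title|user|diffBytes but got: " + rawLine);
        }
        Map<String, Object> edit = new LinkedHashMap<>();
        edit.put("title", parts[0].trim());
        edit.put("user", parts[1].trim());
        edit.put("diff-bytes", Integer.parseInt(parts[2].trim()));
        return edit;
    }

    public static void main(String[] args) {
        // Illustrative input only.
        System.out.println(parse("Apache Samza|alice|120"));
        System.out.println(parse("Apache Kafka|bob|-45"));
    }
}
```

A malformed line raises an exception here; a production task would more likely log and skip it so that one bad event does not stop the stream.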

 

The WikipediaStatsTaskApplication demonstrates how to calculate and emit periodic statistics about the incoming events while using a local KV store for durability.

// Input descriptor for the wikipedia-edits topic

KafkaInputDescriptor kafkaInputDescriptor = kafkaSystemDescriptor.getInputDescriptor("wikipedia-edits", new JsonSerde<>());

// Set the default system descriptor to Kafka

taskApplicationDescriptor.withDefaultSystem(kafkaSystemDescriptor);

// Set the input

taskApplicationDescriptor.withInputStream(kafkaInputDescriptor);

// Set the output

taskApplicationDescriptor.withOutputStream(kafkaSystemDescriptor.getOutputDescriptor("wikipedia-stats", new JsonSerde<>()));

// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaStatsStreamTask());
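The combination of periodic statistics and a durable store can be sketched in plain Java as two methods: one called per event, one called when a window closes. This is only a model under stated assumptions: StatsSketch, the key "count-edits-all-time", and the stat names are hypothetical, and an in-memory HashMap stands in for the local KV store (so, unlike a real store, it does not survive a restart).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class StatsSketch {

    // Stand-in for the local KV store: keeps the all-time total across windows.
    private final Map<String, Integer> store = new HashMap<>();

    // Per-window counters, reset every time a window is emitted.
    private int edits = 0;
    private int bytesAdded = 0;

    // Called once per incoming edit event.
    public void process(int diffBytes) {
        edits++;
        bytesAdded += diffBytes;
        store.merge("count-edits-all-time", 1, Integer::sum);
    }

    // Called periodically: returns the stats for the window that just ended
    // and clears the per-window counters. The all-time total is kept.
    public Map<String, Integer> window() {
        Map<String, Integer> stats = new LinkedHashMap<>();
        stats.put("edits", edits);
        stats.put("bytes-added", bytesAdded);
        stats.put("edits-all-time", store.getOrDefault("count-edits-all-time", 0));
        edits = 0;
        bytesAdded = 0;
        return stats;
    }

    public static void main(String[] args) {
        StatsSketch task = new StatsSketch();
        task.process(120);
        task.process(-45);
        System.out.println(task.window()); // first window
        task.process(10);
        System.out.println(task.window()); // second window
    }
}
```

The split matters: the per-window counters can live in memory because they are discarded after each emission, while the running total is the part that needs a durable store.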

 

 

Reference:

https://samza.apache.org/learn/documentation/latest/api/low-level-api.html
