My understanding of the Samza Wikipedia example

The WikipediaFeedTaskApplication demonstrates how to consume multiple Wikipedia event streams and merge them into a single Apache Kafka topic.

// Define the input descriptor

WikipediaInputDescriptor wikipediaInputDescriptor; 

// Define the system descriptor for Kafka

KafkaSystemDescriptor kafkaSystemDescriptor = new KafkaSystemDescriptor("kafka")....

// Define the output descriptor for the wikipedia-raw topic in Kafka

KafkaOutputDescriptor kafkaOutputDescriptor = kafkaSystemDescriptor.getOutputDescriptor("wikipedia-raw", new JsonSerde<>());


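// Set the default system descriptor to Kafka

taskApplicationDescriptor.withDefaultSystem(kafkaSystemDescriptor);

// Set the inputs

taskApplicationDescriptor.withInputStream(wikipediaInputDescriptor);

// Set the outputs

taskApplicationDescriptor.withOutputStream(kafkaOutputDescriptor);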


// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaFeedStreamTask());
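
To check my understanding of the merge step, here is a minimal plain-Java sketch of what a feed task might do with each incoming event. It does not use the Samza API at all: FeedMergeSketch, its process method, the field names, and the channel names are hypothetical stand-ins, and the list stands in for the wikipedia-raw topic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the merge step; no Samza classes are used here.
class FeedMergeSketch {
    // Stand-in for the single "wikipedia-raw" output topic.
    final List<Map<String, Object>> output = new ArrayList<>();

    // Called once per incoming event, whichever input stream it came from.
    void process(String channel, String rawLine) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("channel", channel); // remember which feed produced the event
        event.put("raw", rawLine);     // keep the payload untouched
        output.add(event);             // every input lands in the same output
    }

    public static void main(String[] args) {
        FeedMergeSketch task = new FeedMergeSketch();
        // Hypothetical channel names, used only for illustration.
        task.process("#en.wikipedia", "edit A");
        task.process("#en.wiktionary", "edit B");
        System.out.println(task.output.size() + " events merged");
    }
}
```

The point of the sketch is that the task itself stays trivial: the descriptors decide which streams feed it, and the task only forwards what it receives.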


The WikipediaParserTaskApplication demonstrates how to project the incoming events from the Apache Kafka topic to a custom JSON data type.

// Define a system descriptor for Kafka, which is both our input and output system

KafkaSystemDescriptor kafkaSystemDescriptor;


// Input descriptor for the wikipedia-raw topic

KafkaInputDescriptor kafkaInputDescriptor = kafkaSystemDescriptor.getInputDescriptor("wikipedia-raw", new JsonSerde<>());

// Output descriptor for the wikipedia-edits topic

KafkaOutputDescriptor kafkaOutputDescriptor = kafkaSystemDescriptor.getOutputDescriptor("wikipedia-edits", new JsonSerde<>());


// Set the input

taskApplicationDescriptor.withInputStream(kafkaInputDescriptor);

// Set the output

taskApplicationDescriptor.withOutputStream(kafkaOutputDescriptor);

// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaParserStreamTask());
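
As I understand it, "projecting" here means turning a raw event into a map of named, typed fields before it is written to wikipedia-edits. The sketch below shows that idea in plain Java, without any Samza classes. The "title;user;diffBytes" line format, the field names, and the ParserSketch class are assumptions made up for illustration; the real parser works on a different raw format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for the projection step; no Samza classes are used here.
class ParserSketch {
    // Projects a raw line of the hypothetical form "title;user;diffBytes"
    // into a map of typed fields. Returns null when the line does not match.
    static Map<String, Object> parse(String raw) {
        String[] parts = raw.split(";", -1);
        if (parts.length != 3) {
            return null; // wrong number of fields
        }
        try {
            Map<String, Object> edit = new LinkedHashMap<>();
            edit.put("title", parts[0]);
            edit.put("user", parts[1]);
            edit.put("diff-bytes", Integer.parseInt(parts[2].trim()));
            return edit;
        } catch (NumberFormatException e) {
            return null; // the size field was not a number
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("Samza;alice;42"));
        System.out.println(parse("not a valid line"));
    }
}
```

Returning null for lines that do not match lets the caller skip malformed events instead of failing the whole task.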


The WikipediaStatsTaskApplication demonstrates how to calculate and emit periodic statistics about the incoming events while using a local KV store for durability.

// Input descriptor for the wikipedia-edits topic

KafkaInputDescriptor kafkaInputDescriptor = kafkaSystemDescriptor.getInputDescriptor("wikipedia-edits", new JsonSerde<>());

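// Set the default system descriptor to Kafka

taskApplicationDescriptor.withDefaultSystem(kafkaSystemDescriptor);

// Set the input

taskApplicationDescriptor.withInputStream(kafkaInputDescriptor);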


// Set the output

taskApplicationDescriptor.withOutputStream(kafkaSystemDescriptor.getOutputDescriptor("wikipedia-stats", new JsonSerde<>()));

// Set the task factory

taskApplicationDescriptor.withTaskFactory((StreamTaskFactory) () -> new WikipediaStatsStreamTask());
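
The way I picture the stats task: per-window counters live in memory and are reset after each emit, while the all-time count goes into the KV store so that it is not lost. Below is a minimal plain-Java sketch of that split. It does not use the Samza API; StatsSketch, the key and field names, and the HashMap standing in for the local store are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for the stats step; no Samza classes are used here.
class StatsSketch {
    // Stand-in for the local key-value store (a real store would be durable).
    final Map<String, Integer> store = new HashMap<>();
    int edits = 0;      // edits seen in the current window
    int bytesAdded = 0; // bytes added in the current window

    // Called once per incoming edit event.
    void process(int diffBytes) {
        store.merge("count-edits-all-time", 1, Integer::sum);
        edits++;
        bytesAdded += diffBytes;
    }

    // Called periodically: report the window, then reset the window counters.
    Map<String, Integer> window() {
        Map<String, Integer> stats = new LinkedHashMap<>();
        stats.put("edits", edits);
        stats.put("bytes-added", bytesAdded);
        stats.put("edits-all-time", store.getOrDefault("count-edits-all-time", 0));
        edits = 0;
        bytesAdded = 0;
        return stats;
    }

    public static void main(String[] args) {
        StatsSketch task = new StatsSketch();
        task.process(10);
        task.process(5);
        System.out.println(task.window());
    }
}
```

Only the all-time counter goes through the store; the window counters are cheap to lose because they are rebuilt within one window.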



