Read Kafka Topic

Synopsis

This operator reads messages from a Kafka topic on a specific Kafka cluster.

Description

The operator can either retrieve all previous messages available on the topic or collect new incoming messages. New messages are collected either for a specified amount of time or until a specified number of messages has been retrieved.

Input

connection

The connection to the Kafka server from which the messages are read.

Output

out

The ExampleSet with the collected messages.

Parameters

Kafka topic

The name of the Kafka topic which should be read.

Update topics

Try to retrieve a list of available topics from the server.

Offset strategy

The polling strategy for the topic.

  • earliest: messages are retrieved beginning with the earliest available message
  • latest: only new incoming messages are collected
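
For readers who know the Kafka consumer API, these two strategies resemble the standard consumer property `auto.offset.reset`, which accepts "earliest" and "latest". The sketch below assumes that correspondence; the helper `consumer_config` and the group id are hypothetical, and the operator's actual implementation may differ.

```python
def consumer_config(offset_strategy, group_id="example-group"):
    """Build a Kafka consumer config dict for the given offset strategy.

    Assumption: the operator's strategies map onto the standard Kafka
    consumer property `auto.offset.reset`. The group id is a placeholder.
    """
    if offset_strategy not in ("earliest", "latest"):
        raise ValueError(f"unknown offset strategy: {offset_strategy!r}")
    return {
        "group.id": group_id,
        "auto.offset.reset": offset_strategy,
    }
```

Calling `consumer_config("latest")` returns a dictionary whose `auto.offset.reset` entry is "latest", which would make a consumer skip messages that were published before it started.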

Retrieval time out

Time out when retrieving old messages. Typically relatively short, unless retrieving millions of records. Only applicable if the offset strategy is set to "earliest".

Collection strategy

The strategy used to collect new messages, either by "duration" or by "number".

  • duration: the operator will wait and collect all new messages incoming in the next n seconds
  • number: the operator will wait until n messages are retrieved

Counter

The counter for the collection strategy: either the duration in seconds for which the operator waits, or the number of messages to collect.

Time out

If the collection strategy is "number", this is an additional time out that prevents the operator from waiting too long for enough messages to arrive, for example when the message producer is inactive.

Polling time out

The time out for each individual poll to the Kafka cluster. Increase this value if the connection has a high latency and you experience lost messages.
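
How the collection strategy, counter, time out, and polling time out interact can be illustrated with a small loop. This is a minimal sketch rather than the operator's implementation: `collect` and `poll` are hypothetical names, all time values are assumed to be in seconds, and trimming the result to exactly n messages in "number" mode is also an assumption.

```python
import time


def collect(poll, strategy, counter, time_out=None, polling_time_out=1.0,
            clock=time.monotonic):
    """Collect messages using the "duration" or "number" strategy.

    poll(timeout) is a hypothetical callable that waits at most `timeout`
    seconds and returns a (possibly empty) list of messages.
    """
    if strategy not in ("duration", "number"):
        raise ValueError(f"unknown collection strategy: {strategy!r}")
    start = clock()
    # "duration": the counter is the overall deadline.
    # "number": the optional time_out is the overall deadline.
    limit = counter if strategy == "duration" else time_out
    messages = []
    while True:
        if strategy == "number" and len(messages) >= counter:
            return messages[:counter]  # assumption: trim to exactly n
        wait = polling_time_out
        if limit is not None:
            remaining = limit - (clock() - start)
            if remaining <= 0:
                return messages  # deadline reached, return what we have
            wait = min(polling_time_out, remaining)
        messages.extend(poll(wait))
```

With a real client, `poll` would wrap the consumer's poll call. In "number" mode without a `time_out`, the loop keeps polling until enough messages arrive, which is the situation the additional time out is meant to guard against.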