How to Configure a Basic Client-Server Environment with Kafka & ExtraHop

ExtraHop 5.0 makes using your data however you like even easier with Kafka integration.

Some Background Info on Kafka + ExtraHop

This blog post covers the configuration needed to implement a working Client (EH Appliance) <--> Server (GNU/Linux) environment with Kafka. As of firmware version 5.0, ExtraHop appliances can relay user-created log messages (predefined in one or more triggers) to one or more Kafka Brokers.

What is Kafka?

According to Jay Kreps, one of Kafka's original developers:

"Apache Kafka is an open-source message broker project developed by the Apache Software Foundation written in Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. The design is heavily influenced by transaction logs." (source)

At a high level, Kafka appears similar to other "message brokers" such as ActiveMQ and RabbitMQ. Nevertheless, a few important distinctions set Kafka apart from the competition (source):

  • It is designed as a distributed system which is very easy to scale out.
  • It offers high throughput for both publishing and subscribing.
  • It supports multi-subscribers and automatically balances the consumers during failure.
  • It persists messages on disk and thus can be used for batched consumption such as ETL, in addition to real time applications.
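To make the consumer-balancing point concrete, here is a toy sketch of how a topic's partitions might be redistributed when one consumer in a group fails. It uses plain round-robin assignment for illustration and is not Kafka's actual assignment logic; the consumer names and the partition count are made up.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin; returns {consumer: [partitions]}."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3, 4, 5]  # hypothetical topic with six partitions

# Three healthy consumers share the partitions evenly.
before = assign_partitions(partitions, ["consumer-a", "consumer-b", "consumer-c"])

# consumer-b fails: the surviving members take over its partitions.
after = assign_partitions(partitions, ["consumer-a", "consumer-c"])

print(before)
print(after)
```

Every partition still has exactly one owner after the failure, so no messages are left unread.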

Basic Architecture:

  • Topic(s): A categorised stream/feed of published "messages" managed by the Kafka cluster.
  • Producer(s): A device that publishes messages ("pushes") to a new (or existing) Topic.
  • Broker(s)/Kafka Cluster: Stores the messages and Topics in a clustered/distributed manner for resilience and performance.
  • Consumer(s): A device that subscribes to one or more "Topics" and "pulls" messages from the Broker(s)/Kafka Cluster.

Figure 1 (below) illustrates the basic interaction of "producers" (e.g. EH appliance) and "consumers" (e.g. GNU/Linux host).

Figure 1: Consumers & Producers (source)
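The roles above can be sketched with a small in-memory model. This is only an illustration of the push/pull relationship, not how Kafka is implemented: the class names (ToyBroker, ToyConsumer) and the topic name "extrahop-demo" are hypothetical. The three messages are the ones used in the example trigger later in this post.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self):
        self._logs = defaultdict(list)  # topic name -> list of messages

    def publish(self, topic, message):
        """Producer side: append to the topic's log (creating the topic if needed)."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1  # offset of the message just written

    def fetch(self, topic, offset):
        """Consumer side: return every message at or after the given offset."""
        return list(self._logs.get(topic, [])[offset:])


class ToyConsumer:
    """Pulls from the broker and remembers its own read position per topic."""

    def __init__(self, broker):
        self._broker = broker
        self._offsets = defaultdict(int)

    def poll(self, topic):
        messages = self._broker.fetch(topic, self._offsets[topic])
        self._offsets[topic] += len(messages)
        return messages


broker = ToyBroker()
for msg in ("My1stMsg", "My2ndMsg", 12345):
    broker.publish("extrahop-demo", msg)  # topic is created on first publish

live = ToyConsumer(broker)
first_poll = live.poll("extrahop-demo")   # everything published so far
second_poll = live.poll("extrahop-demo")  # nothing new since the last poll

late = ToyConsumer(broker)                # subscribes later, starts at offset 0
late_poll = late.poll("extrahop-demo")

print(first_poll, second_poll, late_poll)
```

Because the broker keeps the log rather than deleting messages once they are delivered, the late consumer can still read from the beginning. The --from-beginning flag in the console consumer step below relies on the same idea.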


Setup Prerequisites:

  • Server: Debian GNU/Linux 8.2 "Jessie" (or similar Debian derivative - e.g. Ubuntu)

  • Client (ExtraHop Appliance): Firmware version 5.0 or greater.

How to Set Up Your Server

  1. Install the Java JRE and Zookeeper prerequisites:
       sudo apt-get install default-jre-headless zookeeperd

  2. Download and extract the latest (0.9.0.0) version of Kafka:
       wget --quiet --output-document - http://mirror.cc.columbia.edu/pub/software/apache/kafka/0.9.0.0/kafka_2.11-0.9.0.0.tgz | tar -xzf -

  3. Stop the packaged Zookeeper service so that it can instead be started with the helper script and configuration bundled with Kafka:
       sudo systemctl stop zookeeper

  4. Start Zookeeper with the Kafka helper script (it runs in the foreground, so leave it running and continue in a second terminal):
       /path/to/kafka_2.11-0.9.0.0/bin/zookeeper-server-start.sh /path/to/kafka_2.11-0.9.0.0/config/zookeeper.properties

  5. Edit the "server.properties" configuration file, which the Kafka service loads on startup:
       vi /path/to/kafka_2.11-0.9.0.0/config/server.properties
     Set the following properties:
       host.name=$IP_OF_SERVER
       advertised.host.name=$IP_OF_SERVER
       advertised.port=9092
     where $IP_OF_SERVER is the IP address of the GNU/Linux VM/machine hosting the Kafka service.

  6. Start the Kafka service:
       /path/to/kafka_2.11-0.9.0.0/bin/kafka-server-start.sh /path/to/kafka_2.11-0.9.0.0/config/server.properties > /path/to/kafka.log 2>&1 &

  7. (Optional) Run a Kafka Consumer locally to confirm that a topic is receiving messages:
       /path/to/kafka_2.11-0.9.0.0/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic $NAME_OF_NEW_TOPIC_CREATED_BY_EH_TRIGGER --from-beginning
     where $NAME_OF_NEW_TOPIC_CREATED_BY_EH_TRIGGER is the name of the "Topic" that the EH trigger publishes to. The Topic does not need to be new; an existing Topic can be reused if desired.
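If you would rather script the server.properties edit than make it by hand in vi, the following is a minimal sketch of one way to do it. It works on the file's text, so it can be tried without touching a real installation. The sample lines are illustrative rather than a copy of the shipped file, the address 192.0.2.10 is a placeholder for $IP_OF_SERVER, and the helper name apply_overrides is made up for this example.

```python
import re


def apply_overrides(properties_text, overrides):
    """Return properties_text with each key in overrides set to its value.

    The first line that sets a key (whether or not it is commented out with '#')
    is rewritten in place; keys that are not found are appended at the end.
    Assumes each key appears at most once in the input.
    """
    lines = properties_text.splitlines()
    pending = dict(overrides)
    for i, line in enumerate(lines):
        for key in list(pending):
            # Anchored at line start, so "host.name" does not match
            # "advertised.host.name".
            if re.match(r"\s*#?\s*" + re.escape(key) + r"\s*=", line):
                lines[i] = f"{key}={pending.pop(key)}"
                break
    lines.extend(f"{key}={value}" for key, value in pending.items())
    return "\n".join(lines) + "\n"


# Illustrative stand-in for the contents of config/server.properties.
sample = "\n".join([
    "broker.id=0",
    "#host.name=localhost",
    "#advertised.host.name=<hostname routable by clients>",
    "zookeeper.connect=localhost:2181",
])

server_ip = "192.0.2.10"  # placeholder for $IP_OF_SERVER
updated = apply_overrides(sample, {
    "host.name": server_ip,
    "advertised.host.name": server_ip,
    "advertised.port": "9092",
})
print(updated)
```

To apply this to a real installation, read the file into a string, pass it through apply_overrides, and write the result back. Keep a backup copy of the original first.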

How to Set Up Your ExtraHop Appliance

  1. Navigate to the Open Data Streams page: "System Settings" -> "Administration" -> "System Configuration" -> "Open Data Streams"

  2. Select the "kafka" option.

  3. Create a new Open Data Stream (ODS) for Kafka.

  4. Set a memorable name for the new Kafka-related ODS.

  5. Set the "Partition Strategy" to: manual

  6. Set the "Host" to the IP address of the Kafka server: $IP_OF_SERVER

  7. Set the "Port" to: 9092

  8. Ensure the configuration works by clicking the "Test Settings" button.

  9. Once the test confirms that the EH appliance can contact the Kafka server, save the configuration by clicking the "Save" button.

ExtraHop and Kafka integration configuration
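If "Test Settings" fails, it can help to confirm from another machine on the same network that the broker port is reachable at all. The sketch below only checks that a TCP connection to the host and port can be opened. It does not speak the Kafka protocol and is not what the appliance does internally. The function name can_connect is made up for this example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with $IP_OF_SERVER when running from another machine.
    print(can_connect("127.0.0.1", 9092))
```

A result of False usually points at a firewall, a wrong IP address, or a Kafka service that is not listening on the expected interface. Check the host.name setting from the server setup first.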

Example Trigger

Below is a simple trigger that transmits three Kafka messages (My1stMsg, My2ndMsg, 12345) each time it fires. The Trigger API documentation for Kafka (here) provides further examples in which the messages are dynamic metric values rather than the static strings used in this example.

ExtraHop + Kafka Open Data Stream


At this stage your EH appliance should be messaging the Kafka server each time your trigger fires. Note that the EH appliance can create new "Topics" when transmitting messages if they do not already exist.
