Whenever I tweet about syslog-ng's Kafka destination, I gain a few new followers. Most of them are more interested in another Kafka, who was born in Prague toward the end of the 19th century and wrote excellent surreal short stories. Much as I admire Kafka's works, here I'll write, as usual, about syslog-ng and one of its most recent destinations: the Kafka destination.
First of all, let me introduce Kafka, a high-throughput distributed messaging system. It was originally developed at LinkedIn as the backbone of a website activity tracking infrastructure. After it was open sourced, development continued under the umbrella of the Apache Software Foundation. In 2014 Confluent was founded to provide enterprise-level support to Kafka users. Kafka is now used by major companies, including Netflix, Twitter and PayPal, and it has many more uses than activity tracking: message queuing, log aggregation, stream processing and serving as a commit log.
There are four important terms to know if you want to understand the basics of Kafka and where syslog-ng fits into the picture. For a more detailed introduction, check the Kafka documentation.
Topics are the categories in which Kafka maintains feeds of messages.
Producers publish messages to a Kafka topic.
Consumers subscribe to topics to process the published messages.
Brokers are the servers that make up Kafka itself, which runs as a cluster of one or more of them.
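To make these four terms a bit more concrete, here is a toy in-memory model in Python. It is only a sketch for illustration: `ToyBroker` and `ToyConsumer` are hypothetical names invented here, not part of any Kafka client library, and a real Kafka cluster adds partitions, replication and persistence that this model ignores.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a single Kafka broker (NOT the real Kafka API)."""

    def __init__(self):
        # topic name -> append-only list of messages
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        """Called by producers: append a message to the end of a topic."""
        self.topics[topic].append(message)

    def read(self, topic, offset):
        """Called by consumers: return all messages from `offset` onward."""
        return self.topics[topic][offset:]


class ToyConsumer:
    """Subscribes to one topic and remembers how far it has read."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Fetch messages not seen yet and advance the read position."""
        messages = self.broker.read(self.topic, self.offset)
        self.offset += len(messages)
        return messages


broker = ToyBroker()
consumer = ToyConsumer(broker, "syslog")

# A producer (the role syslog-ng plays) publishes messages to a topic.
broker.publish("syslog", "sshd: Accepted password for alice")
broker.publish("syslog", "cron: job started")

first_batch = consumer.poll()   # both messages published so far
second_batch = consumer.poll()  # nothing new has been published since
print(first_batch)
print(second_batch)
```

The point of the sketch is the division of roles: the producer only appends to a topic, the broker only stores, and each consumer keeps track of its own position in the topic.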
The syslog-ng application can act as a producer and publish messages to a Kafka topic. But it does more than simply collect syslog messages and publish them to Kafka. It can collect messages from several sources, then process and filter them before forwarding them to Kafka. This can simplify the architecture, lessen the load on brokers because filtered-out messages never reach them, and ease the work of consumers, as they receive pre-processed messages.
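The idea of parsing and filtering before publishing can be sketched in a few lines of Python. This is not how syslog-ng is configured (it has its own configuration language for parsers and filters); the sample log lines, the regular expression and the `parse` and `keep` helpers are all assumptions made up for this illustration.

```python
import re

# Hypothetical raw log lines, as they might arrive from several sources.
RAW_LINES = [
    "<38>Nov 10 12:00:01 web1 sshd[411]: Accepted password for alice",
    "<15>Nov 10 12:00:02 web1 app[977]: debug: cache warm",
    "<35>Nov 10 12:00:03 db1 sshd[120]: Failed password for bob",
]

# Simplified pattern for BSD-style syslog lines; real parsers handle more cases.
LINE_RE = re.compile(
    r"<(?P<pri>\d+)>(?P<date>\w{3} +\d+ [\d:]+) (?P<host>\S+) "
    r"(?P<program>[^\[:]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)"
)


def parse(line):
    """Turn a raw line into a dict of named fields, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # The syslog priority value is facility * 8 + severity.
    record["severity"] = int(record.pop("pri")) % 8
    return record


def keep(record):
    """Filter: drop debug-level messages (severity 7) before they reach the brokers."""
    return record["severity"] < 7


# Only parsed, non-debug records would be published to the Kafka topic.
published = [r for r in map(parse, RAW_LINES) if r is not None and keep(r)]
for record in published:
    print(record["host"], record["program"], record["message"])
```

In this sketch the debug line never leaves the sender, so the brokers store less, and consumers receive records with named fields instead of raw text that each of them would have to parse again.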
You can read more about how syslog-ng can improve your Kafka infrastructure in my blog at https://czanik.blogs.balabit.com/2015/11/kafka-and-syslog-ng/