Apache Kafka as a distributed streaming platform
Apache Kafka and Big Data
Apache Kafka is a distributed, highly scalable message queue capable of handling very large volumes of messages and many simultaneous client connections. Kafka makes it possible to meet the challenges posed by Big Data in areas where broker technologies based on the JMS or AMQP standards have fallen short. The technology is well suited to a dynamically changing service market, and it copes with the huge volumes of information generated by mobile environments and the IoT (Internet of Things).
The solution implements the publish-subscribe communication model, with particular emphasis on maximum throughput and minimum message delivery latency. Kafka guarantees that messages are delivered to the subscriber’s system thanks to its message log mechanism. Because data is stored persistently on Kafka nodes, it can also be processed in “batch” mode, much as ETL tools do.
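The relationship between the message log, subscribers, and batch processing can be illustrated with a small sketch. This is a toy in-memory model written for illustration only, not Kafka's client API: the class name TopicLog, its methods, and the subscriber names are all hypothetical, and it assumes a single log with no partitions, replication, or network.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for one topic's message log (hypothetical)."""

    def __init__(self):
        self._records = []                # append-only message log
        self._offsets = defaultdict(int)  # next offset to read, per subscriber

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, subscriber, max_records=10):
        """Return unread messages for a subscriber and advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:start + max_records]
        self._offsets[subscriber] = start + len(batch)
        return batch

    def replay(self, from_offset=0):
        """Re-read retained messages without touching any subscriber offset."""
        return list(self._records[from_offset:])


log = TopicLog()
for event in ["login", "play", "pause"]:
    log.publish(event)

streamed = log.poll("analytics")  # streaming-style read by one subscriber
late = log.poll("billing")        # a second subscriber still sees every message
batch = log.replay(0)             # batch-style pass over the retained log
```

Because messages stay in the log after they are read, each subscriber tracks only its own position, and a later batch job can scan the same data from the beginning.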
Kafka has found its way into many organizations that need to process terabyte-scale streams of client information with maximum reliability. Examples include Spotify, Uber, and PayPal.
Some potential technical applications of Kafka include: