Event Streaming

  • Post category:Technology

Publish/subscribe (pub/sub) technologies have been around for a while.  IBM added the technology to MQSeries, and it is supported by other queuing or event-based frameworks.  In the past, it was mainly used to reliably deliver messages to subscribing applications.

Pub/sub was primarily targeted at smaller data sizes.  Large batch integration between systems has been the domain of Extract Transform Load (ETL) frameworks, where large data sets are exchanged.  However, we are starting to see a shift away from large file batch processing towards a more real-time, event stream style of processing.  In some cases, such as receiving the closing prices of stocks after the market closes, processing a large file as a batch might still be the better fit, but there are many situations where a real-time event stream is better.  It all depends on whether the target system needs (and wants) real-time data or not.  If the target system and the event stream infrastructure can handle real-time events and their volume, then the real-time approach is usually the better choice.  IoT is an example where real-time event streams are required.

The advantage of event streaming over queuing is that events (or messages) can be read by multiple consumers, each one working through the events independently of the others.  Kafka also allows topics to be partitioned for increased performance.
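To make those two ideas concrete, here is a minimal in-memory sketch in Python.  It is not Kafka's API; the Topic and Consumer classes, the sensor names, and the billing and analytics consumers are all hypothetical and exist only to show how an append-only log lets each consumer keep its own read position, and how hashing a key picks a partition.

```python
from collections import defaultdict
from zlib import crc32


class Topic:
    """Toy in-memory topic (hypothetical): a fixed set of append-only partitions."""

    def __init__(self, num_partitions=2):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Events with the same key always hash to the same partition,
        # so their relative order is preserved within that partition.
        index = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index


class Consumer:
    """Each consumer tracks its own offset per partition; reading removes nothing."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self):
        # Return every event this consumer has not seen yet, then advance its offsets.
        events = []
        for index, partition in enumerate(self.topic.partitions):
            start = self.offsets[index]
            events.extend(partition[start:])
            self.offsets[index] = len(partition)
        return events


topic = Topic(num_partitions=2)
for sensor, reading in [("sensor-a", 20.5), ("sensor-b", 18.0), ("sensor-a", 21.0)]:
    topic.publish(sensor, reading)

billing = Consumer(topic)
analytics = Consumer(topic)

billing_events = billing.poll()      # billing sees all three events
analytics_events = analytics.poll()  # analytics sees the same three, independently
```

With a classic queue, the first reader would have consumed the messages and the second would find nothing.  Here both consumers see every event because reading only moves that consumer's own offset.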

Kafka was originally developed at LinkedIn and later donated to the Apache Software Foundation, which now maintains it as a distributed event streaming platform.  It is starting to replace ETL in some companies.  Confluent has been especially active in promoting this technology, selling it as a package with additional connectors.  See https://www.confluent.io/product/confluent-platform/.

Stream All Things
Gwen Shapira discussed how event streaming is replacing ETL in her talk, Stream All Things – Patterns of Modern Data Integration.

AWS has something similar called Kinesis, https://aws.amazon.com/kinesis/data-streams/.  Kinesis integrates with AWS Lambda, and the combination of the two can be used to process event streams.
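As a rough sketch of what the Lambda side of that combination can look like in Python: Lambda hands the function a batch of records, and each record carries its Kinesis payload base64-encoded under record["kinesis"]["data"].  The function names and the sample event below are hypothetical, and the sketch assumes producers wrote UTF-8 JSON, which may not match your stream.

```python
import base64
import json


def decode_records(event):
    """Decode the Kinesis payloads in one Lambda invocation's batch."""
    decoded = []
    for record in event.get("Records", []):
        # Lambda delivers the Kinesis payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers wrote UTF-8 JSON; adjust for other formats.
        decoded.append(json.loads(payload))
    return decoded


def handler(event, context):
    """Hypothetical Lambda entry point: decode the batch and report its size."""
    events = decode_records(event)
    for item in events:
        print(item)  # placeholder for real processing
    return {"processed": len(events)}


# Local usage with a hand-built sample event (hypothetical data).
sample_payload = json.dumps({"device": "sensor-a", "reading": 20.5}).encode("utf-8")
sample_event = {
    "Records": [
        {"kinesis": {"data": base64.b64encode(sample_payload).decode("ascii")}}
    ]
}
result = handler(sample_event, None)
```

Calling the handler locally with a hand-built event like this is a convenient way to test the decoding logic before wiring the function to a real stream.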

There is a good comparison of Kafka and Kinesis at http://cloudurable.com/blog/kinesis-vs-kafka/index.html.