Understanding Spark Streaming with Kafka and Druid

Full Talk (40 Minutes)

Real-time analytics is a new trend in Big Data technologies and usually has a significant business impact. A commonly used architecture for real-time analytics at scale is based on Spark Streaming and Kafka. However, when you combine these technologies at high scale, you can find yourself searching for a solution that covers more complicated production use cases.
In this talk I will present my solution for combining Spark Streaming and Kafka without data loss when Spark jobs restart. I will walk through my steps towards the solution, describing the effect of each step. Finally, I will present the working solution.