Bangalore Streams meetup - March 2024

Saturday 02 Mar 2024 9:30am - 2:30pm

Discover Our Venue

Find Us Here: Our Event Address

Affluence, No. 72/1, St. Mark's Road, Bangalore 560001

Where We Are: Navigate to Our Event Hub

Nasdaq Corporate Solutions, Bengaluru

On-Demand Talks: Access Our Recorded Talks

Tijo Thomas, Imply

Real time Analytics with Apache Kafka and Apache Druid

Avinash Upadhyaya, Platformatory

Real time data lake for CDC with Apache Paimon and Flink

Atri Sharma, Atlassian

Time Series Text Indexing in Apache Pinot

Hello Bengaluru

We're excited to continue our new series of events in Bengaluru focused on data streaming and adjacent technologies. Our objective is to share knowledge and provide a platform for thought leadership around:

Event Streaming Technologies (Apache Kafka and more)
Event-Driven Architecture
Stream Processing
Streaming Databases
Real-time analytics
Data Mesh
...and more.

We're hosting our next in-person event on March 2. Join us for exciting discussions in the streaming world with opportunities to network with peers and leaders in the industry.

Schedule

Name	Speaker	Start Time	End Time	Presentation	Recording
Welcome and registration		10:00 AM	10:20 AM
Real time Analytics with Apache Kafka and Apache Druid	Tijo Thomas, Imply	10:30 AM	11:15 AM		YouTube
Real time data lake for CDC with Apache Paimon and Flink	Avinash Upadhyaya, Platformatory	11:20 AM	12:00 PM	Slides	YouTube
Networking break		12:00 PM	12:15 PM
Time Series Text Indexing in Apache Pinot	Atri Sharma, Atlassian	12:15 PM	01:00 PM		YouTube
Lunch and Networking		01:00 PM	02:30 PM

Talks

Real time Analytics with Apache Kafka and Apache Druid

Speaker: Tijo Thomas, Manager, Solutions Architect @ Imply

About the talk: In the digital era, the ability to process and analyze data in real-time is becoming increasingly vital for organizations aiming to stay competitive and make informed decisions swiftly. Kafka, a distributed event streaming platform, has emerged as a powerful tool for businesses looking to capitalize on real-time data analytics. This presentation will guide you through the essentials of generating and managing large volumes of events with Kafka, and how to leverage these capabilities for real-time analytics using Druid.

We will explore the architectural underpinnings and internals of Druid that facilitate the construction of robust real-time analytics applications. The discussion will extend to practical strategies for navigating the complexities of data streaming, focusing on how to use Kafka effectively alongside other cutting-edge tools to build scalable, efficient, and high-performing real-time analytics solutions.

Real time data lake for CDC with Apache Paimon and Flink

Speaker: Avinash Upadhyaya, Platform Engineer at Platformatory

About the talk: Apache Paimon is a streaming data lake platform with high-speed data ingestion, changelog tracking and efficient real-time analytics. In this talk, we explain how Paimon addresses the problem of bringing change data capture (CDC) data into the data lake, covering CDC ingestion, partial updates to the data, and streaming reads of the changelog. Handling CDC in a data lake poses challenges such as syncing CDC data through schema changes, using a partial-update merge engine, and tracking changes in the data stream. Paimon simplifies CDC handling and integrates closely with Flink. To close, we’ll provide an overview and discuss what’s on the horizon for Paimon.

Time Series Text Indexing in Apache Pinot

Speaker: Atri Sharma, Senior Principal Engineer at Atlassian

About the talk: Time series engines such as Apache Pinot are great at streaming aggregations, but text search is a different beast. This talk will focus on the native text index and engine built for Apache Pinot, which works well for time series engines and maintains the invariant that the latest data is the most valuable.

Speakers

Tijo Thomas
Manager, Solutions Architect @ Imply. A Data Engineer with 20+ years of experience as Architect, Global Lead Solutions Architect, and Software Engineer (R&D), mostly in the area of big data and streaming technologies. He has been part of developing big data platforms and has helped customers solve their data challenges using big data, Spark and Druid. During this time he has collected best practices, patterns and anti-patterns applied in production environments.
Avinash Upadhyaya
Platform Engineer at Platformatory.io
Atri Sharma
Senior Principal Engineer at Atlassian