Confluent Developer ft. Tim Berglund, Adi Polak & Viktor Gamov

Data Modeling for Apache Kafka – Streams, Topics & More with Dani Traphagen

Helping users succeed with Apache Kafka® is a large part of Dani Traphagen’s role as a senior systems engineer at Confluent. Whether she’s advising companies on implementing parts of Kafka or rebuilding their systems from the ground up, Dani is passionate about event-driven architecture and the way streaming data provides real-time insights into business activity.

She explains the concepts of a stream, topic, key, and stream-table duality, and how each of these pieces relates to the others. When it comes to data modeling, Dani covers important business requirements, including the need for a domain model, the practice of domain-driven design principles, and bounded contexts. She also discusses the attributes of data modeling (time, source, key, header, metadata, and payload) and explores the significance of data governance and lineage, as well as performing joins.
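
For listeners new to stream-table duality, the idea can be sketched in a few lines of plain Python. This is an illustration only, not code from the episode and not a Kafka API: the `stream_to_table` and `table_to_stream` helpers, the sample events, and the use of `None` as a delete marker are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# An event is a (key, value) pair. These names are illustrative, not a Kafka API.
Event = Tuple[str, Optional[int]]


def stream_to_table(events: Iterable[Event]) -> Dict[str, int]:
    """Fold a stream into a table by keeping the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in events:
        if value is None:
            # Assumption: a None value acts as a delete marker for the key.
            table.pop(key, None)
        else:
            table[key] = value
    return table


def table_to_stream(table: Dict[str, int]) -> List[Event]:
    """Emit one event per row; replaying these events rebuilds the same table."""
    return [(key, value) for key, value in table.items()]


events: List[Event] = [("alice", 1), ("bob", 5), ("alice", 3), ("bob", None)]
table = stream_to_table(events)
print(table)  # {'alice': 3}
```

The key decides which events update the same row, which is one reason key choice matters so much when modeling topics.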

EPISODE LINKS

  • Convert from table to stream and stream to table 
  • Distributed, Real-Time Joins and Aggregations on User Activity Events Using Kafka Streams
  • KSQL in Action: Real-Time Streaming ETL from Oracle Transactional Data
  • KSQL in Action: Enriching CSV Events with Data from RDBMS into AWS
  • Journey to Event Driven – Part 4: Four Pillars of Event Streaming Microservices
  • Join the Confluent Community Slack
  • Fully managed Apache Kafka as a service! Try free.

SEASON 2
Hosted by Tim Berglund, Adi Polak and Viktor Gamov
Produced and Edited by Noelle Gallagher, Peter Furia and Nurie Mohamed
Music by Coastal Kites
Artwork by Phil Vo

  • 🎧 Confluent also has a podcast for tech leaders: "Life Is But A Stream" hosted by our friend, Joseph Morais.