58 min

#95 Measuring Your Data Mesh Journey Progress with Fitness Functions - Interview w/ Dave Colls Data Mesh Radio


https://www.patreon.com/datameshradio (Data Mesh Radio Patreon) - get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) https://docs.google.com/spreadsheets/d/1ZmCIinVgIm0xjIVFpL9jMtCiOlBQ7LbvLmtmb0FKcQc/edit?usp=sharing (here)
Provided as a free resource by DataStax https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB)
Transcript for this episode (https://docs.google.com/document/d/1r2GDDj3IQ0L4UO3iGv-5sLPCU0rKErjrbUzN7vMTzTY/edit?usp=sharing (link)) provided by Starburst. See their Data Mesh Summit recordings https://www.starburst.io/learn/events-webinars/datanova-on-demand/?datameshradio (here) and their great data mesh resource center https://www.starburst.io/info/distributed-data-mesh-resource-center/?datameshradio (here)
In this episode, Scott interviewed Dave Colls, Director of Data and AI at Thoughtworks Australia. Scott invited Dave on due to a few pieces of content, including a 2021 webinar on fitness functions with Zhamak. There aren't any actual bears, as guests or referenced, in the episode :)
To start, some key takeaways/thoughts and remaining questions:
Fitness functions are a very useful tool for assessing questions of progress/success at a granular, easy-to-answer level. Those answers can then be rolled up into a bigger picture. Start with fitness functions early in your data mesh journey so you can measure your progress along the way. To develop your fitness functions, ask "what does good look like?" (see the first sketch after this list).
Focus your fitness functions on measuring things you will act on or that are important to measuring success. Something like the amount of data processed is probably a vanity metric - drive towards value-based measurements instead.
Your fitness functions may lose relevance and that is okay. You should be measuring how well you are doing overall, not locking yourself into measuring the same thing every X time period. What helps you assess your success? Again, measure things you will act on; otherwise it's just a vanity metric.
Dave believes the reason to create - or genesis of - a mesh data product should be a specific use case. The data product can evolve to serve multiple consumers, but to start, you should not create a data product unless you know how it will (likely) be consumed and have at least one consumer.
Team Topologies can be an effective approach to implementing data mesh. Using the TT approach, the enablement team should simultaneously focus on 1) speeding the time to value of the specific stream-aligned teams they are collaborating with and 2) looking for reusable patterns and implementation details to add to the platform to make future data product creation and management easier.
We still don't have a great approach to evolving our data products to keep the analytical plane in sync with "the changing reality" of the actual domain on the operational plane. On the one hand, we want to maintain a picture of reality. On the other, data product evolution can cause issues for data consumers. So we must balance reflecting a fast-changing reality against disrupting data consumers, including downstream consumers and cross-data-product interoperability. There aren't great patterns for how to do that yet (the second sketch after this list shows one possible approach).
There is a tradeoff to consider regarding mesh data product size. Dave recommends you resist the pull of historical data ways - and woes - of trying to tackle too much at once. The smaller the data product, the narrower its scope, which makes it easier to maintain and quicker to deploy and gather feedback on. But smaller-scoped data products will increase the total number of data products, likely making data discovery harder. And can data product owners effectively manage many data products in their portfolios? Dave recommends using the Agile Triangle framework to figure out a good data product scope (link at the end).
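
As a concrete illustration of the fitness function idea above, here is a minimal Python sketch - not from the episode; the data product fields, checks, and thresholds are illustrative assumptions - showing how small, easy-to-answer checks can be rolled up into a bigger-picture progress score:

```python
# Minimal sketch of data mesh fitness functions (illustrative only).
# Each function answers one small "what does good look like?" question;
# the answers roll up into a single big-picture score.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

@dataclass
class DataProduct:
    name: str
    owner: Optional[str]        # named owner, if any
    last_updated: datetime      # last successful refresh (UTC)
    consumer_count: int         # hypothetical usage metric

# A fitness function answers one granular yes/no question.
FitnessFunction = Callable[[DataProduct], bool]

def has_named_owner(dp: DataProduct) -> bool:
    return dp.owner is not None

def is_fresh_within_a_day(dp: DataProduct) -> bool:
    return datetime.now(timezone.utc) - dp.last_updated < timedelta(days=1)

def has_at_least_one_consumer(dp: DataProduct) -> bool:
    # Value-based check: data nobody consumes is a vanity output.
    return dp.consumer_count >= 1

FITNESS_FUNCTIONS: list[FitnessFunction] = [
    has_named_owner,
    is_fresh_within_a_day,
    has_at_least_one_consumer,
]

def fitness_score(dp: DataProduct) -> float:
    """Roll granular answers up into one big-picture number (0.0-1.0)."""
    return sum(fn(dp) for fn in FITNESS_FUNCTIONS) / len(FITNESS_FUNCTIONS)
```

The point is the shape, not these specific checks: each function is cheap to evaluate and actionable on its own, individual functions can be retired as they lose relevance, and the roll-up tracks journey-level progress.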
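
On the open question of evolving data products without breaking consumers, one pattern worth sketching - an assumption on my part, not something prescribed in the episode - is to publish a new versioned output port alongside the old one and retire the old version only after a deprecation window, so the product can track the changing domain while consumers migrate at their own pace:

```python
# Sketch of side-by-side versioned output ports (illustrative pattern;
# all names and fields here are assumptions, not from the episode).
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional

@dataclass(frozen=True)
class OutputPort:
    version: str                             # e.g. "v1", "v2"
    schema: Dict[str, str]                   # column name -> type
    deprecated_after: Optional[date] = None  # None = current version

class DataProductPorts:
    """Serve old and new schemas simultaneously during a migration."""

    def __init__(self) -> None:
        self._ports: Dict[str, OutputPort] = {}

    def publish(self, port: OutputPort) -> None:
        self._ports[port.version] = port

    def resolve(self, version: str, today: date) -> OutputPort:
        port = self._ports[version]
        if port.deprecated_after is not None and today > port.deprecated_after:
            raise LookupError(
                f"{version} was retired on {port.deprecated_after}; "
                "please migrate to a newer output port."
            )
        return port

# Usage: v1 keeps working through its deprecation window while v2
# reflects the domain's new reality (a renamed column, here).
ports = DataProductPorts()
ports.publish(OutputPort("v1", {"customer_id": "int", "region": "str"},
                         deprecated_after=date(2023, 1, 31)))
ports.publish(OutputPort("v2", {"customer_id": "int", "region_code": "str"}))
```

This doesn't resolve the tension - consumers still eventually have to move - but it turns a breaking change into a scheduled migration rather than a surprise.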

Dave mentioned he first started discussing fitness functions regarding...
