4 episodes

I have designed and deployed enterprise-level data governance frameworks across large corporations. On datActionable, I share best practices.

datactionable.substack.com

DatActionable Frederic BERNARD-PAYEN

    • Business and personal finance


    S01E04 - The fall of the Kingdom of Process and the birth of the Data Federation.

    Business functions now have only one word in mind: Data (usually followed by Artificial Intelligence). However, unless your company sells a data product or a data-based service, the business is not data itself. It can be a tangible item or a digital delivery, like software. Your company is not (or not mainly, or not yet at all) monetising its data, but selling a product or a service.
    Let’s step back and look at your company, considering it as a system, a black box. Your customer has requirements and your company provides the product that fulfils these requirements. Of course, some data can transit from the customer to your company or the other way round, but it is still a negligible part of the data you hold in your company.
    What happens inside is your know-how, and your customer doesn’t really care about it, at least as long as your cost, quality and delivery time are not killing the value he was expecting.
    For decades, pushed by standardisation initiatives of the 90’s such as the ISO 9000 standards, companies have documented their activity through processes and people, with the support of some IT tools. The place of data was limited to some inter-process exchanges of deliverables, closer to paper dossiers than to data as we consider it today. These exchanges were mainly documented along the production chain.
    What we aim for is a data-driven company: data-driven decisions, data-driven reporting, data-driven behaviours. Yes, your organisation will become data centric. Nevertheless, becoming a pure data company is another story.
    This legacy is there, and pretending it doesn’t exist is a mistake. We need to root data in this ecosystem. Yes, the process kingdom is falling. It doesn’t mean we need to kill the processes.
    Connecting the dots between data catalogue concepts and company processes.
    So, by connecting with the quality team, or simply by consulting your internal portal, you may find the process repository.
    You should find something like this.
    Processes, Activities, Tasks: from a pure ISO 9000 perspective, you have a set of interrelated or interacting activities that transform inputs into outputs. Sometimes the outputs are called deliverables.
    I won’t deep dive here, but I insist on the fact that these are conceptual, like the Business Objects and Business Object Views.
    Nevertheless, in terms of granularity, I have at least some convictions that frame the model I propose:
    * Tasks should be associated with a limited number of roles and be executed in a continuous time frame. Interruptions can occur, but they are not the standard behaviour.
    * Activities, on the other hand, group these Tasks. An Activity may have a deliverable to which its Tasks contributed.
    * Finally, Processes group and sequence the Activities. You should have a process owner. Most of the time, this process owner is attached to the organisation executing the process. It will help deployment if the granularity of your processes permits such an allocation, so that you can identify your data stewards and link them with their data officer.
    The level of granularity of a process is so high that the only data catalogue concept you can link to it is the Business Object, as defined in Episode 1. It won’t give you a lot of information but, at least, it will help to allocate the data sensitivity requirements we have seen in Episode 3.
    To reach a finer level of granularity, we need to go down to Activity level and link Activities with the Business Object Views. Activities create, enrich or simply use the Business Object Views. Tasks should also interact with Business Object Views, which are naturally the ones aggregated at Activity level.
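    The linkage described above can be sketched as a small data model. This is a minimal illustration in Python, not a reference implementation: every class name, field name and sample value (`Product`, `Product recipe`, `Head of R&D` and so on) is hypothetical, and deriving the Activity-level views from the Tasks is only one possible reading of the aggregation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Interaction(Enum):
    """How a Task touches a Business Object View (create, enrich or use)."""
    CREATE = "create"
    ENRICH = "enrich"
    USE = "use"


@dataclass(frozen=True)
class BusinessObject:
    name: str


@dataclass(frozen=True)
class BusinessObjectView:
    name: str
    business_object: BusinessObject


@dataclass
class Task:
    name: str
    roles: list[str]  # a Task is tied to a limited set of roles
    views: dict[BusinessObjectView, Interaction] = field(default_factory=dict)


@dataclass
class Activity:
    name: str
    tasks: list[Task] = field(default_factory=list)

    def views(self) -> set[BusinessObjectView]:
        # Assumption: the views of an Activity are the union of its Tasks' views.
        return {view for task in self.tasks for view in task.views}


@dataclass
class Process:
    name: str
    owner: str
    activities: list[Activity] = field(default_factory=list)

    def business_objects(self) -> set[BusinessObject]:
        # At Process level, only the coarse Business Objects are exposed.
        return {v.business_object for a in self.activities for v in a.views()}


# Hypothetical sample data, for illustration only.
product = BusinessObject("Product")
recipe_view = BusinessObjectView("Product recipe", product)
costing_view = BusinessObjectView("Product costing", product)

write_recipe = Task("Write recipe", ["Process engineer"],
                    {recipe_view: Interaction.CREATE})
estimate_cost = Task("Estimate cost", ["Cost controller"],
                     {recipe_view: Interaction.USE,
                      costing_view: Interaction.CREATE})

industrialise = Activity("Industrialise product", [write_recipe, estimate_cost])
develop = Process("Develop product", "Head of R&D", [industrialise])
```

    With this shape, `develop.business_objects()` answers the coarse question (which Business Objects does this process touch), while `industrialise.views()` gives the finer Activity-level picture.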
    What to do with the Deliverables of your Business Processes?
    The last element I mentioned in the process repository was the Deliverables. My opinion is that they are similar in terms of granularity to the Business Object Views. Nevertheless, they were designed a long time before the start of the company data journey. So, it is difficult t

    • 10 min.
    S01E03 - The Data Compliance Quest

    I was looking for an introduction for this episode and found this quote:
    The rules have changed. There's a fine line between right and wrong. And, somewhere in the shadows, they send us in to find it.
    It made me smile, and I’m sure it will make you smile if you get the reference. Don’t ask me how my search led me to this text… and let’s come back to our topic.
    Yes, the data rules have changed and they will continue to change in the near future. Data-related regulations like GDPR have impacted the usage of data in business, right down to our private usage. If you are conducting a data ecosystem watch, you may have heard about such coming changes. I will mention the proposed regulations which will impact data, like the Data Governance Act or the Digital Services Act. Of course, depending on your business sector, you may have other laws, regulations and other constraints leading to requirements on data.
    On top of these external requests, your own company requirements also put your data under constraints. I am typically talking about your internal data security classification.
    Understanding the data compliance complexity.
    The data compliance requirements usually end up in the security domain and give you constraints in terms of protection of confidentiality, integrity, availability or traceability.
    You now face the data openness dilemma: data in jail or people in jail? If you open all datasets, you will have people in jail or, at least, you will have to pay fines. If you close the datasets too much, putting data in jail, people won’t be able to create value with data. Does it sound familiar?
    Of course, datasets are not only well-structured tables in a database. They also encompass the free text in such tables, your office documents and even videos.
    To achieve this quest, you need to know the risks coming with your data. You need an effective system, given the impact at stake. You also need an efficient system, because nobody wants to pay for this enabler. Keeping in mind that the rules will change, we are talking about recurring costs.
    First observation: a dataset is sensitive for a reason.
    Let’s illustrate this with a fictitious dataset: a database table containing the products of the company, with the following information:
    * Name of the product.
    * Recommended retail price.
    * List of ingredients on the package.
    * Colour.
    * Dimensions.
    * Production recipe.
    * Production costs.
    * Minimal acceptable price.
    * Person responsible for the production.
    * Some photos of the product.
    Just reading it, you can feel this dataset is sensitive. You also feel that only some pieces of information are sensitive, not all:
    * Some information is public by nature, like the name of the product, recommended retail price, dimensions or colour.
    * For the production costs, it’s clear. You don’t want them to go to your competition. The same goes for the minimal acceptable price.
    * For the production recipe, it’s not so obvious. On your latest product, you want to protect it. For an older product whose patent has expired, not really.
    * With the person responsible for the production, you face another sensitivity axis: privacy. You have to protect this information even if, from a pure business perspective, it didn’t really come with a big risk before GDPR.
    * With the photos of the product, you potentially open a Pandora’s box. Are the photos purely illustrative, or can they reveal manufacturing secrets? I won’t deep dive on this last one today …
    If you consider the dataset only as an indivisible whole, it will be marked as “confidential plus privacy”. It will block the data analyst working on ingredients. Annoying, isn’t it?
    It can be even worse if we mark the dataset as “confidential”, hiding the privacy topic under the same axis. In that case, if privacy law changes, we don’t even know how we are impacted, and we need to reopen all datasets marked as “confidential” to check.
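    The product table above can be used to sketch what tagging at attribute level, with one tag per sensitivity axis, could look like. This is a minimal Python sketch under stated assumptions: the axis names, the helper functions `readable_attributes` and `impacted_by`, and the exact tagging of each attribute are hypothetical. The recipe is tagged as for a recent product, the ingredients are assumed public because they are printed on the package, and the photos are left out since their sensitivity is not settled here.

```python
from dataclasses import dataclass

# Hypothetical axis names: one tag per reason an attribute is sensitive.
CONFIDENTIALITY = "confidentiality"
PRIVACY = "privacy"


@dataclass(frozen=True)
class Attribute:
    name: str
    axes: frozenset = frozenset()  # empty set means public by nature


@dataclass
class Dataset:
    name: str
    attributes: list


def readable_attributes(dataset, clearances):
    """Names of the attributes a user may read, given the axes they are cleared for."""
    granted = set(clearances)
    return [a.name for a in dataset.attributes if a.axes <= granted]


def impacted_by(datasets, axis):
    """Names of the datasets holding at least one attribute on the given axis."""
    return [d.name for d in datasets
            if any(axis in a.axes for a in d.attributes)]


products = Dataset("products", [
    Attribute("name"),
    Attribute("recommended_retail_price"),
    Attribute("ingredients"),  # assumed public: printed on the package
    Attribute("colour"),
    Attribute("dimensions"),
    Attribute("production_recipe", frozenset({CONFIDENTIALITY})),
    Attribute("production_costs", frozenset({CONFIDENTIALITY})),
    Attribute("minimal_acceptable_price", frozenset({CONFIDENTIALITY})),
    Attribute("production_responsible", frozenset({PRIVACY})),
])

# An analyst with no special clearance still reaches the ingredients.
analyst_view = readable_attributes(products, clearances=set())

# If privacy law changes, only datasets tagged on that axis need a review.
to_review = impacted_by([products], PRIVACY)
```

    Keeping privacy and confidentiality as separate tags is what makes the second query possible; a single “confidential” label on the whole table would answer neither question.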
    Second observa

    • 12 min.
    S01E02 - Data Catalog the orphan, looking for a family

    Previously on datActionable. We have seen the necessity of a data catalog with a new business perspective to govern the data. I insist on this point: the need is driven by the governance of data, not by the management of data. The businesses need to take decisions on the data to be able to create value while ensuring the protection of the company. Typically, a business needs to know:
    * where the datasets are, in order to serve them to the analytics teams.
    * where the datasets impacted by privacy law are, for current compliance but also to analyse the impact of a change in the law. And privacy is just one axis of criticality among others.
    * which datasets are the most critical for its business, to protect them from a loss of confidentiality but also to prioritise them in the disaster recovery plan.
    * which data to put data quality effort on, at company level and not at local project level.
    In this article, I’ll propose an organisation pattern with elements for both the business functions and the IT side. For those reading a lot about data mesh: yes, it is data mesh ready.
    The challenge to onboard business.
    The potential value will be created by the business with data, just as the potential loss will be borne by the business. We need them onboard.
    This balance between data valorisation and data responsibility is the main enabler of the transformation to a data centric company. The question is now how to identify, in the business functions, the owners of this topic at the scale of the company.
    Redefining the data steward role and complementing it.
    The dream would be to find people able to handle both the data governance and the data management perspectives. And of course, people with time to spend on these topics. Finding one four-leaf clover is great; finding the dozens you need is another story.
    If we want to rely on the people in place and their knowledge, my conviction is that we should not ask these experts to drastically change their scope. So, we must be realistic about the role of the Data Stewards and complement them with another role, more focused on data management - I like to call this latter role Data Architect.
    I won’t give the full view of roles and responsibilities in this article; I will focus only on those related to the data catalog.
    The accountabilities of the respective roles would be the following :
    Data Stewards, who have high business knowledge and good IT acumen, are in charge of :
    * documenting the Business Objects.
    * documenting the restrictions coming with the data, for compliance to external or internal requirements.
    * documenting the business requirements on the data, especially the data quality requirements.
    Data Architects, who have high data management knowledge and good business acumen, are in charge of :
    * documenting the data-related information of applications.
    * documenting the Dataset Index.
    You may have noticed that I make neither of them accountable for the actual usage rights of the data, nor for the decisions on data quality improvement. I reserve these decisions for another role: the Data Officer. We will see that in episode 3.
    Federating Data Stewards and Data Architects.
    As mentioned just before, the number of people taking on the Data Steward and Data Architect roles can rise quickly in the company, especially if you want to data-enable people already in place and, by design, they are not full time on these roles.
    Orchestrating this network from a single central point can’t be efficient: even if we want to break silos, there are some clusters of people who regularly work together.
    The idea is to group them by domain, with a domain lead. This domain lead will functionally report to a central data governance body. Here we have the Data Officer role.
    As with Data Stewards, four-leaf clovers are not everywhere. I propose a similar approach by introducing a Lead Data Architect role. Like the Data Architect and the Data Steward, the Data Officer and the Lead Data Architect will act as a complementary couple to ensure the construction of the Data Catalog. Th

    • 12 min.
    S01E01 - The Data Catalog is dead, long live the Data Catalog!

    Since the 90’s, IT teams have aimed to know where data is in the information system. For 30 years, they have been struggling to demonstrate the value of such a catalog to the businesses and to secure a recurring budget for it. Like Data Governance itself, this topic should be a business topic, not an IT topic. Creating value with data comes with responsibilities, starting with knowing this data.
    What’s wrong with data catalogs?
    The first point of comparison between data catalog solutions is the type or number of connectors available. It is representative of the source of the problem: current data catalogs try to map, at attribute level, the physical tables of the information system.
    To answer the business question “where is this data”, the data catalogs map all data with precision. It is as if, in order to know in which districts the inhabitants of a city live, one had to keep precise addresses with street number, street, postal code and city. Don't put words in my mouth. I'm not saying that this level of granularity isn't necessary, but that it isn't always necessary. And when it is necessary, it is because there is a use case that justifies it.
    To continue with this people comparison, the business question in that case is not always - almost never in fact - where are “the” people. It is more where is “a population” of people. The same goes for data requests: when we want to create value with data, we don’t necessarily need all the data of one type but only a subset. If we want businesses to ask for data with business wording and not database table names, we must have a data catalog ready for this.
    Let’s stop this people comparison, because data is not people. One of its superpowers is ubiquity (I’ve heard about a guy with this superpower but it is another story). To be precise, even data doesn’t have this superpower: when it is duplicated, its quality - especially related to timeliness - changes. By the way, it shakes the concept of looking for the source of truth: business is in fact looking for the good source for the intended purpose. The definition of good data would be more data at quality … and at cost, rather than a quest for the truth.
    I’ll finish with a last point, even if it is not really the last one: four is enough to challenge the current principles of data catalogs. How do you know that your data catalogue is complete? Since it was initially designed for databases, taking the shortcut that a dataset equals a table, we miss other kinds of datasets: unstructured documents, videos, images, not to mention the data hidden in free text areas. And even for structured data, it’s easy to miss some populations of data because they are in a different information system. It typically occurs during mergers and acquisitions that let the IT legacy live on.
    To make it short :
    * The granularity of identification of location of data is not adapted for business.
    * The granularity of the business data expression is not adapted for business.
    * The link between business view and IT view is not adapted for business.
    * The catalogue completeness is unknown so, it is not adapted for business.
    How to enable a business-ready data catalog?
    Start with a new way to describe data for businesses by creating “Business Objects”
    It all starts with the business! We need a new way to express data for businesses without going into information system implementation details. The ambition is to meet the following requirements:
    * It should be understandable first by the business: both by the business accountable for the data and by the other businesses.
    * It should enable the description of populations of data, from a business perspective.
    * It should be manageable over time: businesses expect a return on investment for the time spent describing the data.
    * It should be efficient: even if we don’t have a precise mapping, we need to know where to look.
    Now forget what vendors put behind “Business Object” which is, as you may have understood, not

    • 12 min.
