Data mesh: What it is and why you should care

This article was contributed by Bruno Aziza, head of data and analytics at Google Cloud

“Data mesh” is a term that most vendors, educators, and data pundits seem to have landed on en masse to define one of the most disruptive trends of the data, AI, and analytics worlds. According to Google Trends, in 2021 “data mesh” overtook “data lakehouse,” a term that had until then been fairly popular in the industry.

Put mildly, if you work in technology, you won’t be able to escape the data mesh in 2022.

Data mesh: a simple definition

The data mesh originates from a paper authored in May 2019 by Zhamak Dehghani. In this piece, the Thoughtworks consultant describes the limits of centralized, monolithic, and domain-agnostic data platforms.

These platforms often take the form of proprietary enterprise data warehouses with “thousands of unmaintainable ETL jobs, tables, and reports that only a small group of specialized people understand, resulting in an under-realized positive impact on the business,” or complex data lakes that are “operated by a central team of hyper-specialized data engineers that [have], at best, enabled pockets of R&D analytics,” according to Dehghani. The latter case is often referred to as a “data swamp,” a data lake where data of all kinds stagnates, goes un-utilized, and is ultimately useless.

The data mesh intends to offer a solution to these issues by focusing on domain-driven design and by guiding leaders toward a “modern data stack” that balances centralization and decentralization of metadata and data management.

One of the best explanations and implementations of the data mesh concept I’ve read to date is in L’Oréal CIO Francois Nguyen’s two-part series entitled “Toward a Data Mesh” (Part 1, Part 2).

If you haven’t read it yet, stop everything and do that now. There is no better guidance than that of practitioners who put theories into practice and report real-world findings on their data journey. Francois’ paper is full of useful guidance on how a data mesh can shape your data team’s composition and organization. “Part Deux” of his blog provides tested, technical guidance on how to implement a data mesh successfully.

Remember that a data mesh is more than technical architecture; it is a way to organize yourself around data ownership and its activation. When deployed successfully, the data mesh becomes the foundation of a modern data stack that rests on six key principles. For your data mesh to work, data must be 1) discoverable, 2) addressable, 3) trustworthy, 4) self-describing, 5) inter-operable, and 6) secure.
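To make the six principles concrete, here is a minimal sketch of how a team might record them for one data product and check which ones are still unmet. Everything in it is an assumption made for illustration: the `DataProduct` class, its field names, the `SHARED_FORMATS` set, and the sample values do not come from Dehghani’s paper or from any specific product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical set of formats that other domains are assumed able to read.
SHARED_FORMATS = {"parquet", "avro", "json", "csv"}


@dataclass
class DataProduct:
    """Hypothetical descriptor for one domain-owned data product."""
    name: str
    owner_domain: str
    catalog_entry: Optional[str] = None                    # discoverable
    address: Optional[str] = None                          # addressable
    quality_checks_passed: bool = False                    # trustworthy
    schema: Dict[str, str] = field(default_factory=dict)   # self-describing
    data_format: Optional[str] = None                      # inter-operable
    access_policy: Optional[str] = None                    # secure


def unmet_principles(product: DataProduct) -> List[str]:
    """Return the principles this product does not yet satisfy."""
    checks = {
        "discoverable": product.catalog_entry is not None,
        "addressable": product.address is not None,
        "trustworthy": product.quality_checks_passed,
        "self-describing": bool(product.schema),
        "inter-operable": product.data_format in SHARED_FORMATS,
        "secure": product.access_policy is not None,
    }
    return [principle for principle, ok in checks.items() if not ok]


# Illustrative product: catalogued and addressable, but with no passing
# quality checks and no access policy recorded yet.
orders = DataProduct(
    name="orders",
    owner_domain="sales",
    catalog_entry="catalog/sales/orders",
    address="sales.orders.v1",
    schema={"order_id": "string", "amount": "decimal"},
    data_format="parquet",
)
print(unmet_principles(orders))
```

A checklist like this is only a starting point; in practice each check would be backed by a real catalog, quality process, and access-control system rather than a single field.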

In my opinion, a seventh dimension should be added to the data mesh concept: financially responsible and financially accurate. One of the biggest challenges (and opportunities) of a distributed and modern data stack is the true allocation of resources (and cost) to the domains.

Many will interpret this comment as a “cloud costs you more” argument. That’s not what I’m referring to. In fact, I believe that cost shouldn’t be evaluated in isolation. It should be correlated with business value: if your company can get exponentially more value from data by investing in a modern (and responsible) data mesh in the cloud, then you should invest more.
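As a rough illustration of this point, the sketch below splits a shared platform bill across domains in proportion to their usage and then compares each domain’s cost with the value it reports. The function names, the proportional-allocation rule, and all figures are assumptions chosen for the example, not a method described in the article.

```python
from typing import Dict


def allocate_shared_cost(shared_cost: float,
                         usage_by_domain: Dict[str, float]) -> Dict[str, float]:
    """Split a shared bill across domains in proportion to their usage."""
    total_usage = sum(usage_by_domain.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        domain: shared_cost * usage / total_usage
        for domain, usage in usage_by_domain.items()
    }


def value_to_cost(value_by_domain: Dict[str, float],
                  cost_by_domain: Dict[str, float]) -> Dict[str, float]:
    """Return business value generated per unit of cost, for each domain."""
    return {
        domain: value_by_domain[domain] / cost
        for domain, cost in cost_by_domain.items()
    }


# Illustrative figures only: usage shares and reported value are made up.
costs = allocate_shared_cost(1000.0, {"marketing": 30, "supply_chain": 70})
ratios = value_to_cost({"marketing": 900.0, "supply_chain": 1400.0}, costs)
print(costs)
print(ratios)
```

In this made-up case the domain with the smaller bill returns more value per unit of cost, which is the kind of comparison that looking at cost in isolation would miss.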

The biggest issues in this field haven’t been about lack of data or lack of investment. They have been about value. According to Accenture, close to 70% of organizations still can’t get value from their data.

Don’t get distracted by the hype

If your ultimate goal is to drive “business value” from data, how does the data mesh concept help you? One of your biggest challenges this year will probably be to avoid getting caught in the buzzword euphoria that surrounds the term. Instead, focus on using the data mesh as a way to get to your end goal.

There are two key concepts to consider:

The data mesh isn’t the beginning

In a recent piece, my friend Andrew Brust noted that “dispersal is operational data’s natural state” and that “the overall operational data corpus is supposed to be scattered. It got that way through optimization, not incompetence.” In other words, the data you need is supposed to live in a distributed state. It will be on-premises, it will be in the cloud, it will be in multiple clouds. Ask your team: “Have we taken inventory of all the data we need? Do we understand where it all lies?”

Remember that, per the original paper by Dehghani, in order for your data mesh to work, your data needs to be “discoverable, addressable, trustworthy, self-describing, inter-operable and secure.” This presupposes that there is a stage before the data mesh stage.

I have the honor of spending a lot of time with many data leaders, and the best description I’ve heard of what that stage could be is the “data ocean” from Vodafone’s Johan Wibergh and Simon Harris. The data ocean is wider than the landlocked data lake concept. It focuses on securely giving data teams full visibility into the entire data estate available to them, without necessarily moving the data, so that they can realize their potential.

The data mesh isn’t the end

Now that we’ve established that the data mesh needs a data foundation to operate successfully, let’s explore what the data mesh leads you to. If your goal is to generate value from the data, how do you materialize the results of your data mesh? This is where data products come into play.

We know that value from data comes from its usage and its application. I’m not referring to simple dashboards here. I’m referring to intelligent and rich data products that trigger actions to create value and protect your people and business. Think about anomaly detection for your networks, fraud prediction for your bank accounts, or recommendation engines that create superior customer experiences in real time.

In other words, while the data ocean is the architectural foundation required to set your data mesh up for success, the data mesh itself is the organizational model that enables your team to build data products. If every company is a “data company,” its currency is the “data products” it can output, along with their repeatability and reliability. This is a concept that McKinsey Analytics calls the “data factory.”

What should you be worried about?

As you read more about the data mesh concept throughout the year, you will most likely hear from three types of people: the disciples, the distractors, and the distorters.

The disciples will encourage you to go back to the original paper or even contact Dehghani directly if you have questions. You can also order her book, which is coming out soon.

The distractors will be pundits or vendors who will want to label the concept of the “data mesh” as a fad or an old trend: “Look away!” they’ll say, “there is nothing new here!” Be careful. Newness is relative to your current state. Go back to the genesis and decide for yourself if this concept is new to you, your team, and your organization.

The distorters will likely be vendors of software or services who will get a direct benefit from drawing a straight line from the Dehghani paper to their product, solution, or services. Watch out. As my friend Eric Broda explains in his data mesh architecture blog, “there is no single product that brings you the data mesh.”

The best solution, in my opinion, is to connect with the practitioners: those leaders who have put the theory into practice and who are willing to share what they have learned.

Bruno Aziza is head of data and analytics at Google Cloud.
