What is data mesh?
Data mesh defines a platform architecture based on a decentralized network. The data mesh distributes data ownership and allows domain-specific teams to manage data independently.
The data mesh architecture treats data as a product. Ownership is distributed to different end users, who can access and manage their data without requesting permission from a centralized system. This removes the single linear data pipeline, while the central authority can still monitor the system without approving every data request.
Solving scalability issues with data mesh
Data mesh helps enterprises solve scalability issues by distributing access to and control of data to the endpoints where the data resides, without requiring permission from a central location. This speeds up access to data.
Data mesh architecture also offers end-users greater autonomy and control over data, enabling faster time to value. This allows organizations to get from experiment to the desired result faster.
Traditionally, organizations have used a monolithic platform architecture in which all data flows through a single pipeline. With data mesh architecture, the decentralized approach lets different teams take control of the data they need without interfering with the rest of the system.
How does data mesh work?
Data mesh is composed of three main components, as follows:
Data source
A data source can be, for example, a dataset, a file, web data, a live feed from a device, or a data stream. The term may also denote the location where the data was initially created or where it was digitized for introduction into the system.
Data infrastructure
Data infrastructure comprises the digital assets used to share and consume data throughout a network. In a data mesh, a shared platform eliminates the need for data producers to build their own and gives users access to the data controls.
Domain-oriented data pipeline
A domain-oriented data pipeline treats data as a product and then delivers control and management of such data through a domain ownership arrangement. The pipeline eliminates the need for an organization to move data through a single channel and extends distributed ownership to designated end users who are functional data owners.
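The ownership arrangement described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the DataProduct class, its fields, and the domain names "sales", "marketing", and "finance" are hypothetical and do not come from any particular data mesh product. The point it shows is that the owning domain, not a central team, decides who may read its data.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical model of a dataset published and managed by one domain."""

    name: str
    owner_domain: str  # the team that owns and maintains the data
    records: list = field(default_factory=list)
    readers: set = field(default_factory=set)

    def grant(self, granted_by: str, consumer: str) -> None:
        # Only the owning domain may grant access; there is no central approval step.
        if granted_by != self.owner_domain:
            raise PermissionError(f"{granted_by} does not own {self.name}")
        self.readers.add(consumer)

    def read(self, consumer: str) -> list:
        if consumer != self.owner_domain and consumer not in self.readers:
            raise PermissionError(f"{consumer} has no access to {self.name}")
        # Shallow copy, so callers cannot add to or remove from the product's list.
        return list(self.records)


# Usage: the sales domain publishes "orders" and grants marketing read access.
orders = DataProduct(
    name="orders",
    owner_domain="sales",
    records=[{"order_id": 1, "amount": 120.0}],
)
orders.grant(granted_by="sales", consumer="marketing")
marketing_view = orders.read("marketing")
```

In this sketch a request from a domain that has not been granted access, such as "finance", raises PermissionError, and so does an attempt by any domain other than "sales" to grant access.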
What are the benefits of data mesh?
Organizations, especially distributed ones, can gain several benefits from implementing a data mesh.
Scalability
The data mesh places data control in the hands of functional owners who do not need to request access from a central authority. This allows organizations to reduce the cost of operations, eliminate duplication of tasks, and shorten study timelines.
Such control allows organizations to test products and services in a shorter period and get results faster with little interference from a central authority.
Time to implementation
Since data is treated as a product in the mesh, functional owners have a high degree of autonomy. The time to send and receive data from a central location is eliminated.
Independence
The decentralized approach of a data mesh allows companies to become vendor agnostic. This enables teams to connect with multiple systems simultaneously. The independence creates flexibility in the design, making it more effective.
Transparency
A centralized approach can keep expert teams in silos, which hinders transparency across the organization. The decentralized data mesh allows functional ownership of the data across different designated groups. This gives expert teams adequate control of the data, improving transparency and accountability within the organization.
Governance and compliance
Distributed ownership helps enterprises control their systems’ security at the source. This is made possible by reconciling ingested data against the expected formats, volumes, and data sources. As a result, compliance becomes simpler and the entire organization can follow governance guidelines more easily.
Such improved governance and compliance allow companies to ensure that data delivery is optimized for high quality, with access to data within the organization remaining uncomplicated.
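One way to picture reconciling data ingestion with formats, volumes, and data sources is a check that a domain runs on each incoming batch. The Python sketch below is a minimal illustration, not a reference implementation: the IngestionContract class, the check_batch function, and the example source and field names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestionContract:
    """Hypothetical expectations a domain declares for incoming data."""

    allowed_sources: frozenset  # data sources the domain accepts
    required_fields: frozenset  # format: fields every row must carry
    min_rows: int               # volume: smallest acceptable batch
    max_rows: int               # volume: largest acceptable batch


def check_batch(contract: IngestionContract, source: str, rows: list) -> list:
    """Return a list of human-readable problems; an empty list means the batch passes."""
    problems = []
    if source not in contract.allowed_sources:
        problems.append(f"unexpected source: {source}")
    if not contract.min_rows <= len(rows) <= contract.max_rows:
        problems.append(
            f"row count {len(rows)} outside "
            f"[{contract.min_rows}, {contract.max_rows}]"
        )
    for index, row in enumerate(rows):
        # Iterating a dict yields its keys, so this finds the absent field names.
        missing = contract.required_fields.difference(row)
        if missing:
            problems.append(f"row {index} missing fields: {sorted(missing)}")
    return problems


# Usage with made-up values.
contract = IngestionContract(
    allowed_sources=frozenset({"crm"}),
    required_fields=frozenset({"id", "email"}),
    min_rows=1,
    max_rows=3,
)
clean = check_batch(contract, "crm", [{"id": 1, "email": "a@example.com"}])
problems = check_batch(contract, "spreadsheet", [{"id": 2}])
```

Here the first batch passes with no problems, while the second is reported twice: once for its unexpected source and once for the missing email field. Because the check runs inside the owning domain, governance rules are applied where the data enters rather than at a central gate.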
Objectives of data mesh
Organizations should keep their primary end goals in mind when implementing data mesh.
Improve control
The primary goal of the data mesh is to treat data as a product and place its control in the hands of the functional owners. This allows companies to reduce operational expenses and time costs by placing effective management with data consumers, who remain protected through centralized monitoring.
Improve exchange
Data mesh aims to improve data exchange between the organization and the end consumer. By placing effective control over the data into the hands of the functional owners and eliminating a singular pipeline for the movement of data, the organization can enhance productivity and results.
Improve access
Data mesh architecture seeks to improve data access through a decentralized approach. Since control is placed in the hands of functional owners, access to data does not require permission from a central authority, reducing the time to implementation.
Data mesh vs. traditional systems
Zhamak Dehghani, the inventor of data mesh, referred to traditional systems as architectural failures. Data mesh was presented as an alternative that addresses a range of problems in conventional methods.
Platform centralization
One of the biggest challenges of traditional systems is the centralization of information. This is often responsible for creating unnecessary delays in the organization. Since the purpose of the data mesh is to distribute control to end-users, the platform becomes more flexible and reacts to the requirements of the end-users almost instantaneously.
The platform built on data mesh architecture treats data as a product and decentralizes access and control to increase the speed of delivery and implementation.
Declogging
Since the entire data flow in a traditional system architecture is linear, the data movement from the source to the end-user clogs the pipeline. This prevents the system from remaining agile and effectively responding to organizational changes.
A mesh architecture solves this by making each domain responsible for its own data, which it treats as a product. Using interfaces such as APIs and flat files, the system can then provide control of data without limiting the rights of the central authority. This helps businesses scale faster at significantly lower costs.
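To make the mention of APIs and flat files more concrete, the following Python sketch shows a single domain serving the same data in both forms. It is a simplified illustration: the InventoryDomain class, its method names, and the sample rows are hypothetical, and a real API would sit behind a network service rather than a method call.

```python
import csv
import io


class InventoryDomain:
    """Hypothetical domain team that owns and serves its own inventory data."""

    def __init__(self):
        self._rows = [
            {"sku": "A-100", "in_stock": 12},
            {"sku": "B-200", "in_stock": 0},
        ]

    def api_get(self, sku: str):
        """API-style access: look up one record by its key, or return None."""
        for row in self._rows:
            if row["sku"] == sku:
                return dict(row)  # copy, so callers cannot alter the source row
        return None

    def export_flat_file(self) -> str:
        """Flat-file access: the same data rendered as CSV text."""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["sku", "in_stock"])
        writer.writeheader()
        writer.writerows(self._rows)
        return buffer.getvalue()


# Usage: two consumers read the same product through different interfaces.
inventory = InventoryDomain()
record = inventory.api_get("A-100")
flat_file = inventory.export_flat_file()
```

Both consumers are served directly by the owning domain, so neither request has to travel through a shared, linear pipeline.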
Collaboration
When every data request must go to a central authority and wait for confirmation, the pipeline delays the response. Such delays create gaps in reaction time that prevent teams from collaborating effectively. Teams that operate in a centralized system are often disconnected and cannot meet the requirements of the organization quickly.
Data mesh allows teams to gain control of their domains. Authenticated by the central authority, these domains help businesses prioritize innovation through faster execution.
Specialization
Centralized systems discourage expertise. Since data has to be sent back to the central authority, there is little control in the hands of the end-users. The lack of power combined with restricted access prevents teams from becoming the authority in their area of operation.
Since a data mesh works toward decentralized control, it encourages teams to develop expertise. When end users have control over their work environments and the data they deal with, they develop expert skill sets and a broader understanding of their domain.
Choosing between data mesh and data lake
A data mesh architecture is based on the decentralization of access and control of data, creating independence and flexibility within the system and reducing time to implementation. The approach allows companies to improve efficiency by eliminating the transfer of data through a single pipeline while protecting the system with the help of centralized monitoring infrastructure.
A data lake is the counterpart of a data mesh. While the data mesh follows a decentralized approach, a data lake is a centralized repository for managing and controlling data from multiple sources at a single location.
While data mesh seeks to place the control of data as a product in the hands of functional owners, a data lake aims to keep the control of data throughout the organization with a central authority.
Challenges when implementing data mesh
As with any new architecture or technology, there are challenges when implementing and running the system.
Paradigm shift
One of the primary drawbacks of a data mesh is the change in organizational culture that it requires. Implementing the system requires extensive training and education within the organization to make all parties involved aware and supportive of the new model. Because access and control are placed in the hands of end users, they need to understand how to leverage the advantages of the system.
Since a data mesh offers independence to end-users, they are required to understand how their actions will affect the organization. Traditional organizations that rely on highly centralized operating procedures need to address the change and manage the shift to avoid inefficiencies.
High cost to entry
A data mesh requires existing infrastructure, tools, and software to be adapted to the new model. Therefore, when organizations adopt a data mesh architecture for their systems, they must incur the cost of integrating the existing data into a mesh.
Further, the organization would also need to establish infrastructure to support the new mesh architecture. This includes support for integrating data, virtualization of information, governance and compliance, the cataloging of data for the mesh, and delivery of data to the end-user.
Cross-domain monitoring
A data mesh gives end users effective control over the data. Therefore, when organizations implement a mesh architecture, they need to ensure that they can monitor the systems. This is to ensure efficiency and security within the organization.
Monitoring the mesh is necessary to ensure that only authorized users can access the data and information and that access to such data is granted from a central location. This matters because the core of a data mesh lies in decentralizing data and access.
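Cross-domain monitoring of this kind can be sketched as a shared log to which every domain reports its access decisions. The Python example below is a rough illustration that assumes the central function observes decisions rather than making each one. The AccessEvent and MeshMonitor names, the threshold of two denials, and the sample users are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One access decision reported by a domain (hypothetical structure)."""

    domain: str
    product: str
    user: str
    allowed: bool


class MeshMonitor:
    """Central log that records events from all domains without approving them."""

    def __init__(self):
        self.events = []

    def record(self, event: AccessEvent) -> None:
        self.events.append(event)

    def denied_by_user(self) -> Counter:
        # Count denied attempts per user across every domain.
        return Counter(e.user for e in self.events if not e.allowed)

    def flagged_users(self, threshold: int = 2) -> list:
        # Users whose denied attempts reach the threshold, in alphabetical order.
        return sorted(
            user for user, count in self.denied_by_user().items()
            if count >= threshold
        )


# Usage with made-up events from two domains.
monitor = MeshMonitor()
monitor.record(AccessEvent("sales", "orders", "alice", allowed=True))
monitor.record(AccessEvent("sales", "orders", "mallory", allowed=False))
monitor.record(AccessEvent("finance", "invoices", "mallory", allowed=False))
monitor.record(AccessEvent("finance", "invoices", "bob", allowed=False))
flagged = monitor.flagged_users()
```

Only the user with denied attempts in both domains is flagged here, a pattern that neither domain would notice from its own logs alone.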
Data mesh benefits to business
360-degree customer view
One of the primary use cases of the data mesh is to reduce implementation time and handle time for the end user. It is deployed in environments where the organization needs to increase first-contact resolution. The faster resolution time improves customer satisfaction while keeping the system unclogged.
Such implementation provides a holistic view of the end consumer. This provides a detailed picture of the customer’s preferences and helps organizations create predictive churn models to help determine the next-best offerings and other retention strategies.
Hypersegmentation
Data mesh allows organizations to divide data as a product into many smaller segments. This enables teams to get the right information, deliver a better experience, and target customer marketing more precisely.
Hypersegmentation through data mesh allows an organization to deliver unique experiences based on preferences and behavior through the appropriate channels where the customer is likely to engage.
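As a simple illustration of segmenting by preference and channel, the Python sketch below groups customer records into small segments. The segment_customers function, the field names preferred_channel and interest, and the sample customers are assumptions made for this example. Real hypersegmentation would typically draw on far more attributes.

```python
from collections import defaultdict


def segment_customers(customers: list) -> dict:
    """Group customer IDs by (preferred channel, interest)."""
    segments = defaultdict(list)
    for customer in customers:
        key = (customer["preferred_channel"], customer["interest"])
        segments[key].append(customer["customer_id"])
    return dict(segments)


# Usage with made-up customer records.
customers = [
    {"customer_id": 1, "preferred_channel": "email", "interest": "outdoor"},
    {"customer_id": 2, "preferred_channel": "sms", "interest": "outdoor"},
    {"customer_id": 3, "preferred_channel": "email", "interest": "outdoor"},
    {"customer_id": 4, "preferred_channel": "email", "interest": "books"},
]
segments = segment_customers(customers)
```

Each key in the result identifies one narrow audience, such as email subscribers interested in outdoor products, which a team could then reach through the channel named in the key.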
Data mesh creates a flexible, user-driven data environment
The data mesh architecture has revolutionized the way we handle data. The increased independence of systems, control with end users, and targeted delivery of data as a product have allowed a centralized monitoring system to remain in place while reducing data movement within the enterprise. This has ensured faster time to execution and improved scoped, results-oriented experimentation.