Data Mesh: What it is & how to implement it

As organizations strive to become truly data-driven, they often struggle to find the right balance between business agility and technical stability. This balance is the yin and yang of nearly every IT and data project, and low-code platforms have emerged as critical tools to help achieve it.

Before exploring how Peliqan enables data mesh implementation, let’s define what we mean by data activation. Based on common use cases we encounter, data activation encompasses:

  • Data to BI: Piping all required data to your BI tools, with or without a semantic layer. Excel and Google Sheets also remain prominent tools for business users to explore and work with data.
  • Data to Data Warehouse: Ingesting new datasets into the DWH: SaaS apps, event streams, data lakes, etc.
  • Data to SaaS: Data writeback, reverse ETL, or syncing data between applications.
  • Data to ML/AI: Exposing operational data for training ML models and deploying intelligent applications like chatbots.
  • Data to Apps & Business Units: Making data available for front-end application development and cross-functional teams.

More on this later!

A recent article from Yahoo Finance notes that “providers are aiding many European enterprises in the move from traditional data warehouse approaches to more modern, AI-driven data meshes”. This underscores the urgency of adopting a modern data warehouse foundation that is both powerful and scalable.

Data Mesh Definition 

A data mesh is a decentralized data architecture designed for domain-oriented, self-serve data management.

Imagine each business domain (e.g., marketing, sales, finance) owns and manages its data as a product. This fosters data ownership and empowers business teams to access, prepare, and share their data with minimal reliance on central IT.

In 2019, Zhamak Dehghani published a seminal paper proposing the data mesh architectural paradigm. In essence, data mesh applies the proven microservices strategy from software development to the big data domain. 

The goal is to give business teams the data and agility they need to operate effectively in a distributed organizational structure. Before we delve in, it is also worth distinguishing between a data fabric and a data mesh: a data fabric is a technological approach that can facilitate a data mesh architecture, whereas a data mesh is a decentralized data management philosophy.

The four principles of data mesh are:

  • Domain Ownership: Decentralizing data ownership to business domains, which are treated as first-class concerns.
  • Data as a Product: Domains expose their data as a product, complete with SLAs, documentation, and self-service interfaces.
  • Self-Serve Data Platform: Providing a self-service data infrastructure to domain teams as a platform to enable autonomous, yet interoperable, development.
  • Federated Governance: Establishing a federated model for governing the mesh, balancing autonomy and standardization.

Here are some statistics that underline the urgency and importance of adopting a data mesh approach right now.

  • The data mesh market is expected to reach $3.94 billion by 2032, growing at a CAGR of 16.3% from 2023 to 2032.
  • Data mesh continues to be a hot trend, with Monte Carlo’s CEO projecting it to be one of the 10 hottest data engineering trends.
  • Gartner predicts that by 2025, 80% of enterprises will adopt a data mesh approach to achieve business outcomes.

Data Mesh Implementation Challenges 

While the data mesh promises a more agile and data-driven organization, implementing this approach can be challenging.

  • Complexity of Change: Transforming existing, centralized data platforms and team structures requires significant effort. 
  • Technical Hurdles: Data mesh implementations can be technically complex. Distributed data governance, data quality assurance across domains, and ensuring interoperability between data products all require careful consideration.
  • Evolving Landscape: Data mesh is a relatively new concept, and the supporting technology ecosystem is still maturing. Organizations must carefully evaluate solutions that align with their specific needs.

Overcoming the Barriers to Data Mesh Adoption

Implementing a data mesh architecture can seem daunting, but the benefits of empowering domain teams with data ownership and self-service analytics are well worth the effort. However, there are two critical barriers that organizations must overcome to make data mesh a reality:

Business-Friendly Tooling

The first challenge is ensuring that the data platform is approachable and usable for business teams. While some technical complexity is inevitable, the user experience must be intuitive and streamlined. Business users are accustomed to the sleek, user-friendly interfaces of modern SaaS applications, and they expect their data tools to be just as easy to use.

This means striking a balance between simplicity and functionality. The data platform should abstract away the underlying technical complexities as much as possible, while still providing the necessary features and flexibility for data management and analysis. By offering a business-friendly user experience, organizations can drive widespread adoption of the data mesh paradigm within their business units.

Extensive Data Connectivity

The second barrier to data mesh adoption is the challenge of integrating data from the ever-expanding landscape of business applications. In today’s SaaS-dominated world, organizations rely on a multitude of applications to run their business processes, each generating valuable data that needs to be harnessed for analytics and decision-making.

However, the sheer pace of SaaS growth makes it impractical for central IT or the Chief Data Office to bear the sole responsibility of integrating new data sources. Expecting a centralized team to keep up with the constantly evolving data landscape is simply not scalable.

This is where the importance of extensible data connectivity comes into play. To truly embrace the data mesh paradigm, the data platform must empower domain teams to easily connect to any data source they need, whether it’s a SaaS application, an event stream, or a legacy system.

The key to achieving this is to provide a rich library of pre-built connectors for popular data sources, while also offering a user-friendly framework for building custom connectors. This shifts the burden of data integration from central IT to the domain teams who are best equipped to understand their business applications and data needs.

By providing extensive data connectivity options, organizations can ensure that domain teams have the autonomy and agility to integrate new data sources as needed, without being bottlenecked by central IT. This is a crucial step towards realizing the full potential of a data mesh architecture.

Peliqan: Build Data Mesh Architecture

Peliqan simplifies data access and utilization across your organization, empowering a data mesh architecture. It offers a unified platform to connect to various data sources, build data pipelines, create business-ready metrics, and deliver data to BI tools, data warehouses, SaaS applications, and machine learning models. 

Data to BI:

Business Intelligence (BI) tools are crucial for transforming raw data into meaningful insights. However, getting the right data into your BI platform can be challenging. Peliqan streamlines this process by offering a semantic layer between your data sources and BI tools. This semantic layer enables you to define business-friendly data models and metrics, which can then be easily consumed by any BI tool.

Consider a scenario where you want to analyze sales data in Tableau. With Peliqan, you can ingest data from your sales database, CRM, and ERP systems into a unified data model. You can then define key metrics like “Monthly Sales” or “Customer Lifetime Value” in Peliqan’s semantic layer. These metrics can be served to Tableau, allowing business users to explore the data without worrying about the underlying complexity.
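To make this concrete, here is a minimal sketch of what such semantic-layer metric definitions could look like, expressed as SQL-backed definitions in Python. The table name unified_sales and the configuration format are illustrative assumptions, not Peliqan’s actual semantic-layer syntax.

```python
# Illustrative metric definitions on top of a unified sales model.
# Table and column names are assumptions; the format only conveys the idea
# of declaring business-friendly metrics once and serving them to any BI tool.

monthly_sales = {
    "name": "Monthly Sales",
    "description": "Total invoiced revenue per calendar month",
    "sql": """
        SELECT DATE_TRUNC('month', invoice_date) AS month,
               SUM(amount)                       AS monthly_sales
        FROM unified_sales   -- model combining sales DB, CRM and ERP data
        GROUP BY 1
    """,
}

customer_lifetime_value = {
    "name": "Customer Lifetime Value",
    "description": "Total revenue per customer across their lifetime",
    "sql": """
        SELECT customer_id,
               SUM(amount) AS lifetime_value
        FROM unified_sales
        GROUP BY customer_id
    """,
}
```

Because the metrics are defined once in the semantic layer, Tableau, Power BI, or a spreadsheet can all consume the same numbers without re-implementing the logic.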

Peliqan also simplifies exporting data to Excel, acknowledging that spreadsheets are still widely used for business analytics. Users can effortlessly download data from Peliqan into Excel for ad hoc analysis and reporting.

Data to Data Warehouse:

A data warehouse (DWH) is a centralized repository that integrates data from multiple sources. Populating a DWH usually involves complex ETL (extract, transform, load) pipelines to ingest and harmonize data from various systems. Peliqan simplifies this process by offering pre-built connectors for popular data sources like Salesforce, Marketo, Google Analytics, and more.

With Peliqan, you can visually configure data pipelines to extract data from source systems, apply transformations (e.g., cleansing, deduplication, aggregation), and load the processed data into your DWH. Peliqan supports loading data into major cloud data warehouses (Amazon Redshift, Google BigQuery, Snowflake, etc.) and on-premises databases.

For instance, if you want to analyze website clickstream data in your DWH, Peliqan can ingest raw event data from your web analytics platform, apply sessionization and feature engineering, and load the enriched data into your DWH. Your data science team can then build machine learning models to optimize user journeys, while your BI team can create reports to track key funnel metrics.
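As a rough illustration of the sessionization step, the pandas sketch below assigns session IDs to raw click events based on a 30-minute inactivity gap. The column names and timeout are assumptions; inside Peliqan this kind of logic would run as a transformation step in the pipeline.

```python
import pandas as pd

# Illustrative sessionization of raw clickstream events (assumed columns:
# user_id, event_time). A new session starts after 30 minutes of inactivity.
SESSION_TIMEOUT = pd.Timedelta(minutes=30)

events = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "event_time": pd.to_datetime([
        "2024-01-01 10:00", "2024-01-01 10:10",
        "2024-01-01 11:30", "2024-01-01 10:05",
    ]),
})

events = events.sort_values(["user_id", "event_time"])
gap = events.groupby("user_id")["event_time"].diff()
new_session = gap.isna() | (gap > SESSION_TIMEOUT)
events["session_id"] = new_session.groupby(events["user_id"]).cumsum()

# The enriched table can then be bulk-loaded into the DWH as a pipeline step.
print(events)
```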

Data to SaaS:

SaaS (Software as a Service) applications are the backbone of modern enterprises. However, SaaS data often exists in silos, hindering a comprehensive view of your business. Peliqan enables bi-directional data flow between SaaS apps, allowing you to break down these silos.

Suppose you want to sync customer data between your CRM (e.g., Salesforce) and your marketing automation platform (e.g., HubSpot). With Peliqan, you can set up a data pipeline to extract contacts and leads from Salesforce, transform the data to match HubSpot’s schema, and write the data into HubSpot.
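The sketch below shows the general shape of such a sync in plain Python against the public REST APIs of both platforms. The endpoints, API version, and field mapping are simplified illustrations, and pagination, retries, and OAuth token refresh are omitted; Peliqan’s pre-built connectors handle those concerns for you.

```python
import requests

# Illustrative one-way sync: pull unconverted leads from Salesforce and
# upsert them as HubSpot contacts. Endpoints and fields are simplified.
SF_BASE = "https://yourorg.my.salesforce.com"
SF_TOKEN = "..."   # Salesforce OAuth access token (placeholder)
HS_TOKEN = "..."   # HubSpot private-app token (placeholder)

soql = "SELECT Email, FirstName, LastName FROM Lead WHERE IsConverted = false"
resp = requests.get(
    f"{SF_BASE}/services/data/v58.0/query",
    headers={"Authorization": f"Bearer {SF_TOKEN}"},
    params={"q": soql},
    timeout=30,
)
leads = resp.json().get("records", [])

for lead in leads:
    # Map Salesforce fields onto HubSpot's contact properties.
    payload = {"properties": {
        "email": lead["Email"],
        "firstname": lead["FirstName"],
        "lastname": lead["LastName"],
    }}
    requests.post(
        "https://api.hubapi.com/crm/v3/objects/contacts",
        headers={"Authorization": f"Bearer {HS_TOKEN}"},
        json=payload,
        timeout=30,
    )
```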

Peliqan also handles more complex use cases, like enriching SaaS data with data from your DWH. For example, you could join customer purchase history from your DWH with customer profiles in your CRM to calculate customer lifetime value (CLV) and store the CLV back in your CRM for use in sales and marketing campaigns.
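A compact sketch of that enrichment is shown below, assuming illustrative table and column names; the last step (writing the value back to the CRM) is what is usually called reverse ETL.

```python
import pandas as pd

# Join warehouse purchase history with CRM customer profiles and compute
# lifetime value per customer. Table and column names are illustrative.
purchases = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "amount": [120.0, 80.0, 300.0],
})
crm_contacts = pd.DataFrame({
    "customer_id": [1, 2],
    "email": ["a@example.com", "b@example.com"],
})

clv = purchases.groupby("customer_id")["amount"].sum().rename("lifetime_value")
enriched = crm_contacts.merge(clv, on="customer_id", how="left")

# The 'lifetime_value' column can then be written back to the CRM so sales
# and marketing see it directly on each contact record.
print(enriched)
```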

Data to ML/AI:

Machine Learning (ML) and Artificial Intelligence (AI) require high-quality training data to produce accurate models. Peliqan simplifies feeding data into your ML/AI pipelines by providing a unified interface for accessing data from disparate sources.

With Peliqan, data scientists can discover and access data from your DWH, data lake, SaaS apps, and other sources using a single API. They can then use Peliqan’s data transformation capabilities to preprocess the data (e.g., feature scaling, one-hot encoding) and feed it into their ML models.
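For illustration, the preprocessing described above might look like the scikit-learn pipeline below; the feature names are hypothetical sensor attributes, not fields from any specific Peliqan dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Scale numeric features and one-hot encode categoricals before training.
numeric_features = ["temperature", "vibration"]
categorical_features = ["machine_type"]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression()),
])

# Tiny in-memory sample purely to keep the sketch self-contained.
X = pd.DataFrame({
    "temperature": [70.1, 85.3, 66.0, 90.2],
    "vibration":   [0.2, 0.8, 0.1, 0.9],
    "machine_type": ["press", "lathe", "press", "lathe"],
})
y = [0, 1, 0, 1]
model.fit(X, y)
```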

For example, to build a predictive maintenance model for manufacturing equipment, you can use Peliqan to easily combine sensor data from IoT devices with maintenance records from your ERP system. This unified dataset can be fed into your ML pipeline. Once the model is trained, you can deploy it as a REST API using Peliqan’s model serving capabilities, enabling your production systems to consume real-time predictions.
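The general pattern for exposing a trained model as a REST API is sketched below with FastAPI. Peliqan’s built-in model serving may work differently; this sketch trains on a tiny in-memory sample only to stay self-contained, and the feature names are the same hypothetical ones as above.

```python
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.ensemble import RandomForestClassifier

# Train a toy failure-prediction model on dummy sensor data, then expose it
# behind a /predict endpoint that production systems can call in real time.
train = pd.DataFrame({
    "temperature": [70.0, 92.0, 68.0, 95.0],
    "vibration":   [0.2, 0.9, 0.1, 1.1],
    "failed":      [0, 1, 0, 1],
})
model = RandomForestClassifier(random_state=0)
model.fit(train[["temperature", "vibration"]], train["failed"])

app = FastAPI()

class SensorReading(BaseModel):
    temperature: float
    vibration: float

@app.post("/predict")
def predict(reading: SensorReading):
    features = pd.DataFrame([{"temperature": reading.temperature,
                              "vibration": reading.vibration}])
    failure_risk = model.predict_proba(features)[0][1]
    return {"failure_risk": round(float(failure_risk), 3)}

# Run with e.g.: uvicorn predictive_maintenance_api:app --reload
```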

Data to Apps & Business Units:

In large organizations, different business units often require access to the same data for their applications and reporting needs. However, providing this access in a governed and secure manner can be challenging. Peliqan solves this by allowing you to create a data mesh, where each business unit can own and manage their own data products.

With Peliqan, each business unit can ingest the data they need from central sources (e.g., DWH, data lake) into their own domain-specific data mart. They can then enrich and transform this data to suit their specific use cases and expose the resulting data products to their applications and BI tools.

For instance, the finance department might create a data product that calculates key financial metrics (e.g., cash flow, burn rate) and exposes these metrics via a REST API. The front-end team can then consume this API to display the metrics on the company’s financial dashboard. Meanwhile, the sales department might create a separate data product that combines sales data with customer data to calculate sales commissions, which is then exposed to the commission tracking application.
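A minimal sketch of such a finance data product is shown below: compute a burn-rate metric from the finance domain’s data mart and expose it over a REST endpoint that the dashboard team can consume. The table, column names, and endpoint path are illustrative assumptions.

```python
import pandas as pd
from fastapi import FastAPI

app = FastAPI()

def load_monthly_ledger() -> pd.DataFrame:
    # In practice this would query the finance data mart; a static frame
    # keeps the example self-contained.
    return pd.DataFrame({
        "month": ["2024-01", "2024-02", "2024-03"],
        "cash_in": [50_000, 55_000, 60_000],
        "cash_out": [80_000, 82_000, 85_000],
    })

@app.get("/metrics/burn-rate")
def burn_rate():
    ledger = load_monthly_ledger()
    net_burn = (ledger["cash_out"] - ledger["cash_in"]).mean()
    return {"avg_monthly_net_burn": float(net_burn)}
```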

By empowering each business unit to create and manage their own data products, Peliqan enables a decentralized data ownership model while maintaining central governance and security. This federated approach to data management is a key principle of the data mesh architecture.

How Peliqan Can Enable Data Mesh Implementation

Peliqan positions itself as the missing piece for organizations looking to implement a data mesh architecture. Their vision aligns with the core principles of data mesh, empowering business units to take ownership and activate their data. Let’s explore how Peliqan’s features contribute to a successful data mesh implementation:

1. Domain Ownership and Self-Service:

Peliqan Data Warehouse (Satellite Data Warehouse): Peliqan offers the ability to create and manage your own data warehouse (or a schema within a central data warehouse). This fosters domain ownership by allowing business units to control their specific data sets.

Simplified Data Ingestion: Peliqan provides 250+ pre-built connectors and the capability to build custom connectors within days. This empowers business teams to ingest data from any relevant source (SaaS applications, events, emails, spreadsheets) without relying on central IT.
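As a rough sketch of how small a custom connector can be, the snippet below pulls paginated records from a hypothetical REST source. The endpoint, pagination scheme, and function shape are assumptions and do not reflect Peliqan’s actual connector framework.

```python
import requests

def fetch_invoices(api_base: str, token: str):
    """Yield invoice records from a hypothetical REST API, page by page."""
    page = 1
    while True:
        resp = requests.get(
            f"{api_base}/invoices",
            headers={"Authorization": f"Bearer {token}"},
            params={"page": page},
            timeout=30,
        )
        resp.raise_for_status()
        records = resp.json().get("data", [])
        if not records:
            break
        yield from records
        page += 1

# A pipeline step could then write these rows to, say, a 'finance.invoices'
# table in the domain's warehouse schema.
```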

Building Data Assets: Business users can build their own data products (like “Customer 360”) and define a semantic layer tailored to their specific needs. This eliminates the need for complex, “federated” query builders, encouraging self-service data exploration and analysis.

2. Decentralized Data Ownership and Data as a Product:

Data & Metadata Ownership: Peliqan allows business units to own their data and its associated metadata, including the SQL code used for transformations. This ensures control and transparency within each domain.

Peliqan as a Platform: Peliqan provides the underlying infrastructure for data management, freeing business units from managing complex data technologies (data warehouses, data lakes, Trino, etc.).

3. Data Activation and Embedding in Business Units:

Multiple Activation Options: Peliqan offers various functionalities to activate data within business units. These include direct BI and Excel exports, data APIs for exposing KPIs and data sets, and tools for data application development (data syncs, alerting) – all built using Python for broad applicability.
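To give a feel for what such a Python data app can look like, here is a generic alerting sketch: check a KPI and notify a Slack channel via an incoming webhook when it drops below a threshold. The webhook URL and the KPI query are placeholders, not Peliqan-specific APIs.

```python
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # placeholder
KPI_THRESHOLD = 0.95

def fetch_pipeline_success_rate() -> float:
    # Placeholder for a query against pipeline-run metadata.
    return 0.91

rate = fetch_pipeline_success_rate()
if rate < KPI_THRESHOLD:
    # Slack incoming webhooks accept a simple JSON payload with a "text" field.
    requests.post(
        SLACK_WEBHOOK_URL,
        json={"text": f"Pipeline success rate dropped to {rate:.0%}"},
        timeout=10,
    )
```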

4. Federated Architecture and Monitoring:

Centralized Monitoring & Governance: Peliqan offers multi-tenant monitoring, enabling central IT to oversee data pipelines and applications across all Peliqan instances within the organization. This ensures data quality and adherence to governance policies while maintaining data mesh principles.

Data Mesh Topology: Picture a central data warehouse or lake connected to multiple Peliqan instances, each either standalone or integrated with the central warehouse. This federated layout is the data mesh architecture that Peliqan facilitates.

By incorporating these functionalities, Peliqan empowers business units to take ownership and activate their data, aligning perfectly with the core principles of a data mesh architecture.

Data Mesh Success Stories Powered by Peliqan

Realizing the data mesh vision requires overcoming technological and organizational challenges. Peliqan emerges as a powerful catalyst, enabling organizations to embrace the core data mesh principles.

Take Skindr, a telehealth app for dermatology. Instead of relying on central IT, Skindr’s marketing team used Peliqan to build their own “Customer 360” data product. They ingested data from multiple sources, modeled a unified customer view, calculated lifetime value, and served this data to BI and campaign systems – exemplifying domain ownership and data as a product.

At CIC Hotels, Peliqan enabled a decentralized yet governed financial reporting process across the hotel group’s distributed properties. Using Peliqan’s intuitive pipelines, CIC consolidated data from various accounting systems, harmonized it into a unified model, and automatically wrote financials to Google Sheets for board reporting – showcasing federated computational governance.

Globis, a logistics platform, highlights Peliqan’s end-to-end capabilities. Globis built a centralized data warehouse using Peliqan, then leveraged its integrated MLOps environment to develop and deploy predictive models for forecasting container arrivals – operationalizing data science initiatives at scale.

From empowering domain analytics to enabling federated governance and operationalizing AI/ML, Peliqan provides the technological backbone to make the data mesh vision achievable. It offers a low-code experience tailored to domain teams, accelerating the journey towards a truly data-driven, decentralized enterprise.

Conclusion

In conclusion, Peliqan offers a comprehensive platform for activating data across your organization. Whether you’re looking to feed data into your BI tools, populate your DWH, sync data between SaaS apps, enable ML/AI, or democratize data access across business units, Peliqan has you covered. 

By providing a unified platform for data integration, transformation, and serving, Peliqan helps you break down data silos and unlock the full value of your data assets. With Peliqan, you can:

  • Create a unified semantic layer for serving data to BI tools like Tableau and PowerBI
  • Easily ingest data from 250+ sources into your data warehouse using pre-built connectors and visual data pipelines
  • Synchronize data bi-directionally between SaaS applications like Salesforce and Marketo
  • Enable business units to create and manage their own data products within a federated data mesh architecture

Peliqan’s intuitive interface, extensive library of connectors, and powerful data transformation capabilities make it the ideal platform for organizations looking to accelerate their data-driven initiatives. With Peliqan, you can spend less time wrangling data and more time deriving insights that drive your business forward.

FAQs

1. What is a data mesh?

A data mesh is a decentralized data architecture designed for domain-oriented, self-serve data management. In this approach, each business domain (like marketing, sales, finance) owns and manages its data as a product, reducing dependency on central IT teams. This concept was introduced by Zhamak Dehghani in 2019 and applies microservices principles to big data management.

2. What are the 4 pillars of data mesh?

  1. Domain Ownership: Business domains have decentralized ownership of their data and are treated as first-class concerns
  2. Data as a Product: Domains expose their data as products, complete with SLAs, documentation, and self-service interfaces
  3. Self-Serve Data Platform: A self-service data infrastructure platform that enables autonomous yet interoperable development for domain teams
  4. Federated Governance: A federated model for governing the mesh that balances autonomy with standardization

3. What is a data mesh vs data lake?

A data mesh is a decentralized data management philosophy and architecture, while a data lake is a centralized storage repository that holds vast amounts of raw data in its native format. The key distinction is that data mesh focuses on the organizational and architectural aspects of data management, emphasizing domain ownership and data as a product, whereas a data lake only addresses where raw data is stored.

4. What is mesh used for?

Data mesh is used to:

  • Enable business domains to own and manage their data independently
  • Create self-service data products that other teams can easily access and use
  • Break down data silos between different departments
  • Support data-driven decision making across the organization
  • Enable faster data integration and analytics
  • Empower business units to create and manage their own data products while maintaining central governance

5. Is data mesh the future?

Several indicators suggest data mesh is gaining significant momentum:

  • The data mesh market is expected to reach $3.94 billion by 2032, with a CAGR of 16.3% from 2023 to 2032
  • It’s considered one of the 10 hottest data engineering trends by Monte Carlo’s CEO
  • Gartner predicts that by 2025, 80% of enterprises will adopt a data mesh approach to achieve business outcomes

These statistics and industry predictions suggest that data mesh is indeed becoming an important part of the future of data architecture, though organizations must carefully consider implementation challenges such as complexity of change, technical hurdles, and the evolving technology landscape.

Revanth Periyasamy

Revanth Periyasamy is a process-driven marketing leader with over five years of full-funnel expertise. As Peliqan's Senior Marketing Manager, he spearheads martech, demand generation, product marketing, SEO, and branding initiatives. With a data-driven mindset and hands-on approach, Revanth consistently drives exceptional results.


