What does the data fabric architecture look like?
Data powers everything we create, share, and experience. To deliver engagements of that quality, it is important to be able to manage the mammoth volume of data streaming in every moment, says Yash Mehta, an IoT and big data science specialist.
A data fabric continuously retrieves diverse data sets from varied sources and filters out the data assets that are relevant to the business. Beyond facilitating data integration, the fabric eliminates data silos, assures data compliance, and aids automated data governance. Ultimately, it accelerates the digital transformation initiatives of organisations of all types.
At the current growth rate, the data fabric market could reach USD 6.97 billion by 2029, a CAGR of 22.3% and a testament to data fabric's mission to build smarter data processes.
At the core of the fabric, there’s an underlying architecture consisting of many components.
In this post, we discuss these building blocks.
There are many data fabric vendors in the industry, such as IBM, Atlan, K2view, Talend, and NetApp, and each one may vary slightly in its architecture. However, I have outlined the standard component structure that most products follow. Among the many classifications, I liked the one Gartner presented in their insightful article.
The ingestion component captures data from multiple sources, unifies it, and then pipelines it to target systems, thereby ensuring data integrity. This is exactly why the component is also known as the data integration layer.
The data ingestion component optimises real-time, batch, and stream processing and works with a variety of sources, including on-premise databases, cloud systems, application-layer data sources, and others. It should be able to work with data in all formats, both structured and unstructured.
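To make this concrete, here is a minimal Python sketch of a unified ingestion interface. The `Source` connector classes and the `Record` envelope are hypothetical illustrations, not any vendor's API; a real fabric would add stream consumers, schema handling, and error recovery.

```python
import csv
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Iterable, Iterator

@dataclass
class Record:
    """Normalised envelope that every source is mapped into."""
    source: str
    payload: dict[str, Any]
    ingested_at: datetime

class Source:
    """Hypothetical connector interface: one subclass per source type."""
    name: str = "base"
    def fetch(self) -> Iterable[dict[str, Any]]:
        raise NotImplementedError

class BatchCsvSource(Source):
    """Example batch source: rows from a CSV export."""
    name = "orders_csv"
    def __init__(self, path: str):
        self.path = path
    def fetch(self) -> Iterable[dict[str, Any]]:
        with open(self.path, newline="") as f:
            yield from csv.DictReader(f)

def ingest(sources: list[Source]) -> Iterator[Record]:
    """Unify heterogeneous sources into one stream of Records."""
    for src in sources:
        for row in src.fetch():
            yield Record(source=src.name, payload=dict(row),
                         ingested_at=datetime.now(timezone.utc))
```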
After ingestion, the virtualisation component provides a logical abstraction layer over all underlying systems, giving easier access to trusted data.
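The idea can be sketched in a few lines of Python. The `VirtualizationLayer` class and the registered backends below are hypothetical: the point is that callers query a logical dataset name and never know which underlying system answers.

```python
from typing import Any, Callable

class VirtualizationLayer:
    """Logical abstraction: callers query by dataset name, never by system."""
    def __init__(self) -> None:
        self._resolvers: dict[str, Callable[[dict], list[dict[str, Any]]]] = {}

    def register(self, dataset: str,
                 resolver: Callable[[dict], list[dict[str, Any]]]) -> None:
        self._resolvers[dataset] = resolver

    def query(self, dataset: str,
              filters: dict | None = None) -> list[dict[str, Any]]:
        if dataset not in self._resolvers:
            raise KeyError(f"No system registered for dataset '{dataset}'")
        return self._resolvers[dataset](filters or {})

# Hypothetical backends: in reality these would call a warehouse or an API.
fabric = VirtualizationLayer()
fabric.register("customers", lambda f: [{"id": 1, "name": "Acme"}])
fabric.register("invoices", lambda f: [{"id": 9, "customer_id": 1, "total": 120.0}])

rows = fabric.query("customers")  # caller never knows which system answered
```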
As the name suggests, the delivery component retrieves data from multiple sources and delivers it to the target system through any of several methods, including ETL (bulk), messaging, CDC, virtualisation, and APIs.
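The dispatch idea can be illustrated with a hypothetical sketch, one stub handler per delivery method named above; in a real fabric each branch would be backed by an ETL engine, a message broker, a CDC agent, or an API gateway.

```python
from enum import Enum

class Method(Enum):
    ETL_BULK = "etl_bulk"    # periodic bulk loads
    MESSAGING = "messaging"  # publish events to a broker topic
    CDC = "cdc"              # replay change events from source logs
    API = "api"              # serve rows on request

def deliver(records: list[dict], target: str, method: Method) -> None:
    """Route one batch to a target using the chosen method (stubs only)."""
    if method is Method.ETL_BULK:
        print(f"bulk-loading {len(records)} rows into {target}")
    elif method is Method.MESSAGING:
        print(f"publishing {len(records)} events to topic {target}")
    elif method is Method.CDC:
        print(f"applying {len(records)} change events to {target}")
    else:
        print(f"caching {len(records)} rows for API reads at {target}")

deliver([{"id": 1}], "warehouse.orders", Method.ETL_BULK)
```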
The delivery component also prepares data lakes and warehouses for analytics activities: data transformation makes the data analytics-ready for BI systems. While evaluating your data fabric choices, I recommend considering data delivery as an important differentiator.
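As a small, hypothetical illustration of what "analytics-ready" means in practice, the transform below flattens a nested raw record and normalises its types before a warehouse load; the field names are invented for the example.

```python
from datetime import datetime

def make_analytics_ready(raw: dict) -> dict:
    """Flatten and type-normalise one raw record for BI consumption."""
    return {
        "order_id": int(raw["id"]),
        "customer_name": raw.get("customer", {}).get("name", "unknown"),
        "total_usd": round(float(raw.get("total", 0.0)), 2),
        "ordered_on": datetime.fromisoformat(raw["created_at"]).date().isoformat(),
    }

row = make_analytics_ready(
    {"id": "42", "customer": {"name": "Acme"}, "total": "120.5",
     "created_at": "2024-05-21T17:00:00"}
)
# {'order_id': 42, 'customer_name': 'Acme', 'total_usd': 120.5,
#  'ordered_on': '2024-05-21'}
```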
While IBM continues to be the pioneering product, I find K2view a promising competitor. Its data fabric solution follows a micro-database approach, wherein each database holds the data of a single business entity while the fabric maintains millions of such databases.
This enables enterprises to create and deliver data products quickly and seamlessly, and it handles operational and analytical workloads effectively across architecture types, in the cloud as well as on-premise.
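To illustrate the micro-database concept (a sketch of the idea only, not K2view's actual implementation), imagine one lightweight store per business entity, keyed by entity ID:

```python
import sqlite3

class MicroDatabaseFabric:
    """One lightweight store per business entity; the fabric holds millions."""
    def __init__(self) -> None:
        self._stores: dict[str, sqlite3.Connection] = {}

    def _store(self, entity_id: str) -> sqlite3.Connection:
        # Create the entity's micro-database lazily on first access.
        if entity_id not in self._stores:
            conn = sqlite3.connect(":memory:")  # in-memory for the sketch
            conn.execute("CREATE TABLE facts (key TEXT PRIMARY KEY, value TEXT)")
            self._stores[entity_id] = conn
        return self._stores[entity_id]

    def put(self, entity_id: str, key: str, value: str) -> None:
        self._store(entity_id).execute(
            "INSERT OR REPLACE INTO facts VALUES (?, ?)", (key, value))

    def get(self, entity_id: str, key: str) -> str | None:
        row = self._store(entity_id).execute(
            "SELECT value FROM facts WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None

fabric = MicroDatabaseFabric()
fabric.put("customer-123", "plan", "enterprise")
print(fabric.get("customer-123", "plan"))  # enterprise
```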