Evolving Big Data Strategies With Data Lakehouses And Data Mesh

Council Post: Evolving Big Data Strategies With Data Lakehouses And Data Mesh

According to a survey from the MIT Technology Review, 47% of 351 surveyed data executives cite reducing duplicated data as the main success factor of their data strategy initiatives. Our data supports this: approximately 50% of organizations regularly copy transactional data from the warehouse to the data lake (sometimes daily). The consequences are high data movement costs, added data latency and weaker reliability.

Today, transactional data systems span data warehouses and operational databases such as Oracle, Microsoft SQL Server or PostgreSQL (a popular open source database). Machine learning (ML) and analytics, on the other hand, have usually run in data lakes or, since the 1990s, in data warehouses. If this describes your organization, you may well be on the right track, but you have probably noticed rising costs related to ETL (extract, transform and load), data access and data management.

You can reduce ETL-related costs and maximize the return on your investments by loading all of your data into the data lake. From there, you can prepare data according to the desired business logic and store it back in the lake for applications and reports to use. This approach lets you find the data the business requires without reinvesting in data ingestion or significantly disrupting the current implementation. I don’t believe you need to worry too much about storage costs, since cloud storage prices are low. In addition, if you also have a data catalog, having all of the data in the data lake lets users discover and use it without needing IT resources (e.g., as part of a Power BI dashboard).
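The load-once, prepare, store-back flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: a local folder stands in for the data lake, the "raw" and "curated" zone names are a common convention rather than anything the approach requires, and the function names, the column names (region, amount) and the revenue-by-region rule are hypothetical.

```python
import csv
from collections import defaultdict
from pathlib import Path


def land_raw(lake: Path, name: str, rows: list[dict]) -> Path:
    """Write source rows unchanged into the lake's raw zone (hypothetical layout)."""
    path = lake / "raw" / f"{name}.csv"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def prepare_revenue_by_region(lake: Path, raw_name: str) -> Path:
    """Apply an example business rule to raw data and store the result back in the lake."""
    totals: dict[str, float] = defaultdict(float)
    with (lake / "raw" / f"{raw_name}.csv").open(newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += float(row["amount"])

    # The curated zone holds report-ready data; the raw file is left untouched,
    # so new business logic can be added later without ingesting the source again.
    out = lake / "curated" / "revenue_by_region.csv"
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "revenue"])
        for region in sorted(totals):
            writer.writerow([region, f"{totals[region]:.2f}"])
    return out
```

In a real deployment the folder would be cloud object storage and the files would more likely use a columnar format, but the shape is the same: ingest once into the raw zone, then derive as many curated datasets as the business needs.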

I recommend ensuring your organization can perform near real-time analysis on operational databases (e.g., SQL Server-based databases or NoSQL databases such as Cosmos DB). Gartner Inc. calls this capability hybrid transactional/analytical processing (HTAP), and many cloud providers are investing in tools that simplify these integrations. Azure Synapse Link is an excellent example of an HTAP implementation.

Based on my experience in Latin America, data leaders invest time and effort in building a unified platform to reduce the complexity of their analytics infrastructure and promote collaboration across crucial roles such as data engineers, data scientists and business analysts. I understand why: doing so lets them reduce costs, operate more efficiently, focus on organizational challenges and adapt better to continuous change.

A unified platform should also enable data scientists to quickly develop, deploy and operationalize their machine learning models. It should enrich organizational data with predictions so that business analysts can incorporate them into their Power BI reports, shifting the insights from descriptive to predictive.
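As a rough illustration of what enriching organizational data with predictions can look like, the sketch below adds a prediction column to each record so that a reporting tool can read it alongside the original fields. Everything here is an assumption made for the example: the toy_churn_model function is a hypothetical stand-in for a trained model, and the field names and weights are invented.

```python
from typing import Callable

Row = dict[str, float]


def enrich_with_predictions(
    rows: list[Row],
    model: Callable[[Row], float],
    column: str = "predicted_churn",
) -> list[Row]:
    """Return copies of the rows with the model's prediction added as a new column."""
    return [{**row, column: model(row)} for row in rows]


def toy_churn_model(row: Row) -> float:
    """Hypothetical stand-in for a trained model; the weights are invented."""
    score = 0.1 + 0.2 * row["support_tickets"] - 0.05 * row["tenure_years"]
    # Clamp to the [0, 1] range so the value reads as a probability-like score.
    return round(min(1.0, max(0.0, score)), 2)
```

Because the function returns new records instead of modifying the originals, the descriptive data stays intact and the predictive column can be regenerated whenever the model is retrained.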

Organizations demand an easy-to-use experience in which each data workload is purpose-built but deeply integrated. A natural next step in the evolution of your data architecture could be the data lakehouse, a concept introduced by Databricks in 2020.
