Who manages data lakes and what skills are needed?

Among the most common components of modern data architecture is the data lake, a central repository into which data flows from across the organization.

The concept of the data lake has evolved from a simple location for data collection into a more organized approach known as the data lakehouse. Whichever name is used, managing the technology effectively requires specific skills and IT professionals.

A data lake is a large, open storage location that typically uses object storage as a unified repository for raw data, much of it unstructured, coming from multiple sources. Those sources can include event streaming data, operational and transactional data, and databases. While data lakes can run in on-premises environments, they are more commonly built on cloud object storage services that provide large, scalable data capacity, such as Amazon Simple Storage Service (S3), Google Cloud Storage or Microsoft Azure Data Lake Storage.

Data lakes first emerged to help enable big data workloads with the Apache Hadoop big data platform. A data lake architecture differs from a data warehouse in that data is transformed into a structured, organized format before it is stored in the warehouse, whereas a data lake generally keeps data in its original form.

A data warehouse enables users to more easily query the data and use it for data analytics and business intelligence use cases. Data warehouses also provide data governance and data management capabilities.
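The difference between the two approaches can be illustrated with a minimal sketch. It assumes a handful of hypothetical raw JSON events standing in for lake data, and it uses Python's built-in SQLite module as a stand-in for a warehouse; the event fields, table name and function names are illustrative only and are not drawn from any specific product.

```python
import json
import sqlite3

# Hypothetical raw events as they might land in a data lake: stored as-is,
# with inconsistent types and optional fields.
RAW_EVENTS = [
    '{"customer": "ana", "amount": "19.50", "ts": "2024-05-01T10:00:00"}',
    '{"customer": "ben", "amount": 5, "ts": "2024-05-01T10:05:00", "coupon": "X1"}',
    '{"customer": "ana", "amount": "0.50", "ts": "2024-05-02T09:30:00"}',
]


def load_into_warehouse(raw_lines):
    """Transform raw JSON lines into a typed table before storing them."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for a warehouse here
    conn.execute(
        "CREATE TABLE purchases ("
        "customer TEXT NOT NULL, amount REAL NOT NULL, day TEXT NOT NULL)"
    )
    rows = []
    for line in raw_lines:
        event = json.loads(line)
        # Normalize types and keep only the columns the schema defines.
        rows.append((event["customer"], float(event["amount"]), event["ts"][:10]))
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    return conn


def total_by_customer(conn):
    """Run a simple analytics query against the structured table."""
    cursor = conn.execute(
        "SELECT customer, SUM(amount) FROM purchases "
        "GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    warehouse = load_into_warehouse(RAW_EVENTS)
    print(total_by_customer(warehouse))
```

The raw lines keep whatever shape each source produced, while the table enforces column names and types up front, which is what makes the later query straightforward.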

The concept of the data lakehouse -- a term popularized by Databricks -- is an attempt to bring together the best of data lake and data warehouse technologies. A data lakehouse aims to combine the ease of use and open nature of a data lake with the data warehouse's ability to easily execute queries against data.

A data lakehouse provides additional structure on top of a data lake -- often with the use of a data lake table format technology, such as Delta Lake, Apache Iceberg or Apache Hudi. It also uses a query engine technology, such as Apache Spark, Presto or Trino.

Managing data within an organization can be a multi-stakeholder effort. It can involve different job roles depending on the particular use case. Data warehouses are often managed by data warehouse managers and data warehouse analysts. Those two roles involve data management and data analytics skills, which are typically tied to a specific data warehouse vendor technology. Data lake management is often the domain of data engineers, who help design, build and maintain the data pipelines that bring data into data lakes.
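As a rough illustration of the kind of pipeline a data engineer might maintain, the sketch below writes incoming records into a date-partitioned folder layout. It assumes a local directory standing in for an object storage bucket, a `ts` timestamp field on each record, and a `raw/source=<name>/date=<day>/` path convention; all of these, along with the function names, are hypothetical choices for the example rather than a standard.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def partition_path(lake_root, source, day):
    """Build the (assumed) partition layout: raw/source=<source>/date=<day>/."""
    return Path(lake_root) / "raw" / f"source={source}" / f"date={day}"


def ingest(lake_root, source, records):
    """Append records to per-day JSON-lines files and return the files written."""
    by_day = defaultdict(list)
    for record in records:
        # Assumes each record carries an ISO 8601 timestamp in "ts".
        by_day[record["ts"][:10]].append(record)

    written = []
    for day, day_records in sorted(by_day.items()):
        partition = partition_path(lake_root, source, day)
        partition.mkdir(parents=True, exist_ok=True)
        target = partition / "events.jsonl"
        with target.open("a", encoding="utf-8") as handle:
            for record in day_records:
                handle.write(json.dumps(record) + "\n")
        written.append(target)
    return written


def read_partition(lake_root, source, day):
    """Read back every record stored for one source and day."""
    target = partition_path(lake_root, source, day) / "events.jsonl"
    with target.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ingest(root, "orders", [{"ts": "2024-05-01T10:00:00", "id": 1}])
        print(read_partition(root, "orders", "2024-05-01"))
```

Partitioning by source and date is one common way to keep raw data easy to locate and reprocess; a production pipeline would typically add validation, retries and a real object storage client.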
