Democratize big data by using distributed data lakes

A democratization of the big data and analytics process can’t come soon enough for many organizations.

This point was made clear during a talk last week with Michele Goetz, principal analyst for Forrester, and Ben Szekely, vice president and founding engineer for solutions and pre-sales at Cambridge Semantics, a provider of big data analytics tools for end users.

Goetz shared Forrester survey research: 67% of companies can’t access big data, 59% can’t integrate it, and 56% say that the update process for big data is very slow. This isn’t good news for businesses that have spent several years investing in big data and that are undoubtedly expecting more aggressive returns on those investments. Meanwhile, Forrester projects that, worldwide, the amount of data under management will soar from 4.4 zettabytes to 44 zettabytes.

“There is a need to introduce intelligence into the process of interpreting all of this data so the consumers of this data can be empowered to use it through self-service data access,” said Goetz.

Many companies are coming to this conclusion: 59% of the respondents in the Forrester survey said that they will either expand or implement new data preparation capabilities within the next 12 months so they can take better advantage of the data they are collecting.

However, the big data preparation process has been anything but fast. Research reveals that data scientists can spend 50% to 80% of their time collecting, cleaning, and preparing big data, which comes in all sizes and formats. If the plan is to democratize and distribute these already burdensome data preparation tasks, tools have to be built for the task.

Yves Mulkers

Yves Mulkers is the founder of 7wData and a widely followed voice in the data and AI community. He curates the 7wData and AI Beat newsletters, reaching hundreds of thousands of data and AI professionals, and writes on data strategy, analytics, AI, and the evolving data ecosystem.