Breaking Down Data Silos

Predictive analytics, data science, artificial intelligence, bots. The waves of advances in the application of data keep on coming. You can’t read the pages of the mainstream or business media without being impressed by the opportunity. Yet, although the power of analytics is common currency, it’s spoken of far more often than it’s practiced. The biggest obstacle to using advanced data analysis isn’t skill base or technology; it’s plain old access to the data.

Every CIO I meet tells me that they are excited at the potential of analytics for their business, with one caveat: they can’t get their hands on the data in the first place. Embracing data as a competitive advantage is a necessity for today’s business, so why is it so hard to get access to the data we need?

There is a cost to using data. Behind the glamor of powerful analytical insights is a backlog of tedious data preparation. Since the popular emergence of data science as a field, its practitioners have asserted that 80% of the work involved is acquiring and preparing data. Despite efforts among software vendors to create self-service tools for data preparation, this proportion of work is likely to stay the same for the foreseeable future, for a couple of reasons.

First, you can’t cleanly separate the data from its intended use. Depending on your desired application, you need to format, filter, and manipulate the data accordingly. Every new problem has unique aspects that usually reach back into data acquisition and preparation. Second, data confers insight and advantage. Once you have harvested the low-hanging fruit (the easy-to-prepare data), you’re falling behind if you’re not looking for the next level of insight. So you must pursue the data that is harder to find and use, which drives up the time spent on preparation.

But there is a bigger and costlier demon that lurks in enterprises. A demon that can drive up that 80% and often makes initiatives impossible: data silos. These silos are isolated islands of data, and they make it prohibitively costly to extract data and put it to other uses. They can arise for multiple reasons.

Structural. Software applications are written at one point in time, for a particular group in the company. In a world of limited resources, applications are optimized for their main function. The incentives of individual teams are unlikely to encourage data sharing as a primary requirement. This focus on function may, for instance, result in recent sales being stored in a different system from historical sales, which presents an immediate barrier to boosting sales through personalized product recommendations.

Political. Knowledge is power, and groups within an organization become suspicious of others wanting to use their data. Often this is with some justification, as the scope for misuse, even accidental, is broad. Data isn’t a neutral entity: you must interpret it with knowledge of its history and context. Even so, this sense of proprietorship can act against the interests of the organization as a whole.

Growth. Any long-lived company has grown through multiple generations of leaders, philosophies, and acquisitions, resulting in multiple incompatible systems. Even if there are no political issues in integrating data, it is costly to reconcile and integrate sets of data that embody different approaches to important business concepts.
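To make the structural and growth cases a little more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that recent sales sit in one system with amounts in dollars, and historical sales sit in an older system with different field names and amounts in cents. Every field name, record, and helper function here is hypothetical and not taken from any real system.

```python
from collections import defaultdict

# Hypothetical export from the system holding recent sales (amounts in dollars).
recent_sales = [
    {"customer_id": "C1", "sku": "A-100", "amount": 25.0},
    {"customer_id": "C2", "sku": "B-200", "amount": 40.0},
]

# Hypothetical export from a legacy system holding historical sales:
# different field names, and amounts stored as integer cents.
historical_sales = [
    {"cust": "C1", "item": "B-200", "amount_cents": 4000},
    {"cust": "C1", "item": "A-100", "amount_cents": 2500},
]


def unify_sales(recent, historical):
    """Map both sources onto one schema: customer_id, sku, amount in dollars."""
    unified = [
        {"customer_id": r["customer_id"], "sku": r["sku"], "amount": float(r["amount"])}
        for r in recent
    ]
    unified += [
        {"customer_id": h["cust"], "sku": h["item"], "amount": h["amount_cents"] / 100}
        for h in historical
    ]
    return unified


def purchases_by_customer(sales):
    """Group the SKUs each customer has bought across both systems."""
    grouped = defaultdict(set)
    for sale in sales:
        grouped[sale["customer_id"]].add(sale["sku"])
    return dict(grouped)


all_sales = unify_sales(recent_sales, historical_sales)
history = purchases_by_customer(all_sales)
```

Even in this toy case, the mapping encodes decisions (which field identifies the customer, which unit the amounts are in) that only someone who knows each system can supply. That knowledge, rather than the code itself, is where much of the cost of breaking down a silo tends to lie.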

 
