Data hoarding is not a viable strategy anymore
- by 7wData
For years it has been normal practice for organizations to store as much data as they can. More economical storage options combined with the hype around big data encouraged data hoarding, with the idea that value would be extracted at some point in the future.
With advances in data analysis many companies are now successfully mining their data for useful business insights, but the sheer volume of data being produced and the need to prepare it for analysis are prime reasons to reconsider your strategy. To balance cost and value it’s important to look beyond data hoarding and to find ways of processing and reducing the data you’re collecting.
The volume of data that’s being produced daily is growing fast. People generate enormous amounts of data, but machine-generated data is set to eclipse that. With the IoT projected to grow from an estimated 23 billion connected devices this year to almost 31 billion by 2020 and a staggering 75 billion by 2025, according to IHS data at Statista, collecting and storing all that raw data is starting to look impractical.
We’ve kept pace with data generation so far by adopting better compression technologies and backing up incrementally with a focus on what has changed, but as the volume increases we’re going to fall woefully behind. We must find a way to reduce the amount of data that we’re collecting.
The most expensive way to store data is in its raw form, so we need to reduce it by extracting pertinent details such as averages or standard deviations. Streamlining the data we collect and processing it into a useful format seems an obvious answer. However, it’s not as easy as it sounds.
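As a rough illustration of what reducing raw data to averages and standard deviations can look like, here is a minimal Python sketch. The function name `summarize`, the fixed window size, and the sample readings are all hypothetical, chosen only for illustration; a real pipeline would pick its own windowing and statistics.

```python
from statistics import mean, stdev


def summarize(readings, window):
    """Reduce raw readings to one small summary record per window.

    `readings` is a list of numbers (e.g. sensor values) and `window`
    is how many raw readings each summary record replaces. Both are
    illustrative assumptions, not taken from any particular system.
    """
    summaries = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summaries.append({
            "count": len(chunk),
            "mean": mean(chunk),
            # Sample standard deviation needs at least two points,
            # so fall back to 0.0 for a single leftover reading.
            "stdev": stdev(chunk) if len(chunk) > 1 else 0.0,
        })
    return summaries


# Six hypothetical raw readings collapse into two summary records.
raw = [20.0, 22.0, 24.0, 30.0, 30.0, 30.0]
summary = summarize(raw, window=3)
```

The trade-off is the one the article describes: once only the summaries are kept, the individual readings cannot be recovered, so any later analysis is limited to what the chosen statistics preserve.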
In some cases, it may be prudent to store raw data for future audits in the event of liability exposure. Regulatory requirements must also be weighed when deciding what data to keep and what to let go.
Part of the difficulty with boiling data down is that we’re still developing analysis through machine learning and artificial intelligence. That means we’re betting on what will be valuable and what we can afford to discard. It’s neither practical nor prudent to try to store all raw data, but there’s a balance to be found, and much depends on your specific business.
Figuring out what data you want to keep and how the remaining data you’re collecting should be processed is just one piece of the puzzle.