5 top data challenges that are changing the face of data centers

Data is clearly not what it used to be! Organizations of all types are finding new uses for data as part of their digital transformations. Examples of data becoming key to competitive advantage abound in every industry, from jet engines to grocery stores. I call this "new data" because it is very different from the financial and ERP data that we are most familiar with. That old data was mostly transactional and privately captured from internal sources, and it drove the client/server revolution.

New data is both transactional and unstructured, publicly available and privately collected, and its value is derived from the ability to aggregate and analyze it. Loosely speaking, we can divide this new data into two categories: big data – large aggregated data sets used for batch analytics – and fast data – data collected from many sources that is used to drive immediate decision making. The big data–fast data paradigm is driving a completely new architecture for data centers (both public and private).

Over the next series of blog posts, I will cover each of the top five data challenges presented by new data center architectures:

New data is captured at the source. The volume of data collected at the source will be several orders of magnitude higher than we are familiar with today. For example, an autonomous car will generate up to 4 terabytes of data per day. Scale that to millions, or even billions, of cars, and we must prepare for a new data onslaught.
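To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 10^12 bytes) and treats the 4 TB per day figure as a constant rate spread evenly over 24 hours; both are simplifying assumptions, and the function names are purely illustrative.

```python
TB = 10**12                      # decimal terabyte (assumption; the unit is not specified)
SECONDS_PER_DAY = 24 * 60 * 60


def daily_fleet_bytes(cars: int, tb_per_car_per_day: float = 4.0) -> float:
    """Total bytes generated per day by a fleet of cars."""
    return cars * tb_per_car_per_day * TB


def sustained_mbit_per_s(tb_per_day: float = 4.0) -> float:
    """Average bit rate (in Mbit/s) needed to ship one car's daily data as it is produced."""
    return tb_per_day * TB * 8 / SECONDS_PER_DAY / 10**6


if __name__ == "__main__":
    one_million = daily_fleet_bytes(1_000_000)
    print(f"1 million cars: {one_million / 10**18:.1f} exabytes per day")
    print(f"Per car: about {sustained_mbit_per_s():.0f} Mbit/s, around the clock")
```

Under these assumptions, a single car would need roughly 370 Mbit/s of continuous uplink, and a million cars would produce about 4 exabytes per day.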

It is clear that we cannot capture all of that data at the source and then transmit it over today’s networks to centralized locations for processing and storage. This is driving the development of completely new data centers, with different environments for different types of data. One of these is a new “edge computing” environment that is optimized for capturing, storing and partially analyzing large amounts of data prior to transmission to a separate core data center environment.
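As a minimal illustration of "partially analyzing" data at the edge before transmission, the sketch below collapses fixed-size windows of raw readings into small summaries, so that only the summaries would need to travel to the core. The `WindowSummary` type, the `summarize_windows` function, and the choice of statistics are all hypothetical; a real edge pipeline would be shaped by its sensors and workloads.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class WindowSummary:
    """Compact stand-in for a window of raw readings (hypothetical format)."""
    count: int
    minimum: float
    maximum: float
    mean: float


def _summarize(window: List[float]) -> WindowSummary:
    return WindowSummary(len(window), min(window), max(window), fmean(window))


def summarize_windows(readings: Iterable[float], window_size: int) -> Iterator[WindowSummary]:
    """Yield one summary per `window_size` readings; a final partial window is also summarized."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    window: List[float] = []
    for value in readings:
        window.append(value)
        if len(window) == window_size:
            yield _summarize(window)
            window = []
    if window:
        yield _summarize(window)


if __name__ == "__main__":
    raw = [1.0, 2.0, 3.0, 4.0, 5.0]
    for summary in summarize_windows(raw, window_size=2):
        print(summary)
```

The trade-off is the usual one: summaries are far smaller than the raw stream, but anything not captured in them is lost unless the raw data is also kept at the edge.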

The new edge computing environments are going to drive fundamental changes in all aspects of computing infrastructures: from CPUs to GPUs and even MPUs (mini-processing units), to low-power, small-scale flash storage, to the Internet of Things (IoT) networks and protocols that don’t require what will become precious IP addressing.

Let’s consider a different example of data capture. In the bioinformatics space, data is exploding at the source. In the case of mammography, the systems that capture those images are moving from two-dimensional images to three-dimensional images. The 2-D images require about 20 MB of storage capacity, while the 3-D images require as much as 3 GB, a 150x increase in the capacity required to store these images. Unfortunately, most of the digital storage systems in place to store 2-D images are simply not capable of cost-effectively storing 3-D images. They need to be replaced by big data repositories in order for that data to thrive.
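The 150x figure can be checked with a few lines of Python. This sketch assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the archive size in the usage example is a made-up number for illustration only.

```python
MB = 10**6   # decimal units (assumption)
GB = 10**9

SIZE_2D_BYTES = 20 * MB   # approximate size of a 2-D image, from the text
SIZE_3D_BYTES = 3 * GB    # upper-end size of a 3-D image, from the text


def growth_factor(old_bytes: int, new_bytes: int) -> float:
    """How many times larger the new image is than the old one."""
    return new_bytes / old_bytes


def archive_bytes(image_count: int, bytes_per_image: int) -> int:
    """Raw capacity needed to hold `image_count` images, ignoring replication and overhead."""
    return image_count * bytes_per_image


if __name__ == "__main__":
    print(growth_factor(SIZE_2D_BYTES, SIZE_3D_BYTES))        # 150.0
    # Hypothetical archive of 1,000 images, for illustration only.
    print(archive_bytes(1_000, SIZE_3D_BYTES) / 10**12, "TB")  # 3.0 TB
```

With binary units (MiB and GiB) the ratio would come out slightly higher, at about 154x, so the 150x in the text is best read as an approximation.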

In addition, the type of processing that organizations are hoping to perform on these images is machine learning-based, and far more compute-intensive than any type of image processing in the past. Most importantly, for machine learning to be effective, researchers must assemble a large number of images for processing. Assembling these images means moving or sharing them across organizations, which requires the data to be captured at the source, kept in an accessible form (not on tape), aggregated into large repositories of images, and then made available for large-scale machine learning analytics.

Images may be stored in their raw form, but metadata is often added at the source. In addition, some processing may be done at the source to maximize “signal-to-noise” ratios.
