How Does a GPU Database Play in Your Machine Learning Stack?

by 7wData
May 21, 2017

Machine learning (ML) has become one of the hottest areas in data, with computational systems now able to learn patterns in data and act on that information. The applications are wide-ranging, from autonomous robots to image recognition, drug discovery, and fraud detection.

At the cutting edge is deep learning, which draws its inspiration from the networks of neurons that comprise the cerebral cortex. These networks are massively parallel. As such, it’s no surprise that an increasing number of ML approaches are turning to graphics processing units (GPUs), a key hardware component for general-purpose parallel computation.

Kinetica has been leveraging GPUs for massively parallel data analysis since 2012. As an in-memory analytical database, Kinetica is able to utilize multiple GPUs across many nodes to perform massively parallel statistical and analytical queries. Users can also apply custom code for analytical processing by leveraging user-defined functions, allowing Kinetica to integrate with a growing number of GPU-accelerated ML libraries, such as TensorFlow, Caffe, Torch, and BIDMach.

But this raises the question: if your ML library is already leveraging GPUs, what does Kinetica add to the ML stack?

Kinetica is tried and tested in large-scale enterprises, with production clusters deployed over dozens of nodes. At this scale most ML models are trained on subsets of the raw data, and most do not actually retain this raw data. Instead, they use the raw data to learn a state (e.g., the strengths of various network connections) before disposing of it—or siloing it in a data warehouse, never to be seen again.

With Kinetica, data can be stored in-memory and rapidly accessed by the ML model as necessary. One key advantage of having the data closely integrated is that the user can always go back and refit the model as necessary.

Consider an example using time series data. By learning from the data in two stages, first forwards in real time and then again backwards, you will generally achieve a better overall fit to the entire dataset (i.e., Kalman smoothing vs. Kalman filtering). The backward pass is only possible if the raw data is still available.
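To make the filtering vs. smoothing distinction concrete, here is a minimal sketch in plain Python. It assumes a scalar random-walk state observed with noise. The function names and the noise variances `q` and `r` are illustrative choices, not part of Kinetica or any particular library.

```python
def kalman_filter(ys, q, r, m0=0.0, p0=1.0):
    """Forward pass: estimate each state using only data seen so far.

    Assumed model (illustrative): x_t = x_{t-1} + noise(q), y_t = x_t + noise(r).
    """
    means, variances = [], []
    m, p = m0, p0
    for y in ys:
        p_pred = p + q               # predict: random walk adds process noise
        k = p_pred / (p_pred + r)    # Kalman gain
        m = m + k * (y - m)          # update mean toward the observation
        p = (1.0 - k) * p_pred       # update variance
        means.append(m)
        variances.append(p)
    return means, variances


def rts_smoother(means, variances, q):
    """Backward (Rauch-Tung-Striebel) pass: revise each estimate using later data."""
    s_means, s_vars = list(means), list(variances)
    for t in range(len(means) - 2, -1, -1):
        p_pred = variances[t] + q    # predicted variance for step t + 1
        g = variances[t] / p_pred    # smoother gain
        s_means[t] = means[t] + g * (s_means[t + 1] - means[t])
        s_vars[t] = variances[t] + g * g * (s_vars[t + 1] - p_pred)
    return s_means, s_vars


# Hypothetical readings, for illustration only.
readings = [0.9, 1.2, 0.8, 1.1, 1.0]
f_means, f_vars = kalman_filter(readings, q=0.01, r=0.25)
s_means, s_vars = rts_smoother(f_means, f_vars, q=0.01)
print(f_vars)
print(s_vars)
```

In this model the smoothed variance at every step is no larger than the filtered variance, and the two coincide at the final step, where no later data exists to draw on.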

To return to the neuroscience analogy, there is a close parallel to the wake-sleep cycle in animals. The networks of the brain are thought to learn online throughout the course of the day but require a period of sleep in which these models are re-fit to stored memories, most famously in the auto-associative networks of the hippocampus.

Theorists in machine learning have long been aware of the No Free Lunch Theorem. Simply put, no algorithm performs better than any other in general, that is, when averaged over all conceivable inputs. What this means is that ML models can only succeed to the extent that they are well-constructed for the problem at hand. A model that has been developed for image recognition is unlikely to do well when applied to credit card fraud.
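A toy illustration of the idea, under simplifying assumptions: every possible labeling of eight 3-bit inputs is treated as an equally likely target, and two hypothetical learners (invented for this sketch) are scored only on inputs they did not see during training.

```python
from itertools import product

inputs = list(product([0, 1], repeat=3))      # all 8 possible 3-bit inputs
train_x, test_x = inputs[:5], inputs[5:]      # 5 seen, 3 held out


def majority_learner(train):
    """Always predicts the most common training label (ties go to 1)."""
    labels = [y for _, y in train]
    guess = int(2 * sum(labels) >= len(labels))
    return lambda x: guess


def parity_learner(train):
    """Ignores the training data and predicts the parity of the input bits."""
    return lambda x: sum(x) % 2


def average_off_training_accuracy(learner):
    """Average held-out accuracy over every possible target function."""
    correct, total = 0, 0
    for labels in product([0, 1], repeat=len(inputs)):
        target = dict(zip(inputs, labels))
        model = learner([(x, target[x]) for x in train_x])
        correct += sum(model(x) == target[x] for x in test_x)
        total += len(test_x)
    return correct / total


print(average_off_training_accuracy(majority_learner))  # 0.5
print(average_off_training_accuracy(parity_learner))    # 0.5
```

Both learners land on exactly chance accuracy because, across all targets, the held-out labels are independent of anything seen in training. A learner only does better than chance when its built-in assumptions match the structure of the actual problem.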

This is true even with deep learning. It is often asserted that deep learning is a fundamentally new innovation that solves the feature selection problem: that deep learning will learn features from raw data, obviating the need for feature selection. Unfortunately, there is no getting around the No Free Lunch Theorem.

Let’s again consider the cerebral cortex. It is certainly true that the cortex is capable of selecting and refining features via feedback, such as in the early visual cortex. But note that before even arriving in the cortex, visual information has been extensively filtered, such as in the complex circuitry of the human retina. And most of this is fairly hard-wired: if the rules of physics suddenly changed, your eyes would probably not be of much use.

What this means for ML is that models can benefit enormously from incorporating field expertise and the discovered insights of data scientists.
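As a small sketch of how an expert-supplied feature can change what a simple model is able to learn, consider a perceptron on the XOR problem. The dataset and the product feature here are hypothetical stand-ins for domain knowledge and are not taken from any Kinetica workflow.

```python
def train_perceptron(samples, max_epochs=1000):
    """Classic perceptron with a bias term; stops after an error-free epoch."""
    w = [0.0] * (len(samples[0][0]) + 1)      # last weight is the bias
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            xb = list(x) + [1.0]
            pred = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else 0
            if pred != y:
                for i, xi in enumerate(xb):
                    w[i] += (y - pred) * xi   # move the boundary toward y
                mistakes += 1
        if mistakes == 0:
            break
    return w


def accuracy(w, samples):
    """Fraction of samples the weight vector classifies correctly."""
    hits = 0
    for x, y in samples:
        xb = list(x) + [1.0]
        pred = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else 0
        hits += int(pred == y)
    return hits / len(samples)


# XOR is not linearly separable in its raw inputs.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
# Hypothetical "expert" feature: the product of the two inputs.
xor_plus = [((a, b, a * b), y) for (a, b), y in xor]

print(accuracy(train_perceptron(xor), xor))
print(accuracy(train_perceptron(xor_plus), xor_plus))
```

No linear boundary classifies all four raw XOR points, so the first model cannot reach full accuracy however long it trains. With the extra feature the classes become linearly separable and the same algorithm fits them exactly.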

When it comes to putting that expertise to work, Kinetica is an invaluable addition to your machine learning stack.
