The truth about in-memory computing
- by 7wData
A few weeks back one of my favourite analysts, Merv Adrian tweeted the following:
"'Just move it to memory and it will speed up.' Not so fast (pun intended). Serious engineering required – even for a KV store."
I could not help but smile when I saw this. I’ve spent years telling anyone who would listen that putting data into memory doesn’t instantly transform software, originally written for disk-based data, to “in-memory”.
In 1988, at White Cross Systems (a pioneer in MPP in-memory systems, which later evolved into Kognitio) we set out to use the concept of MPP to build a database that would support what we now call data analytics, but was then called Decision Support. Most databases at that time were designed and optimised for transaction processing rather than decision support, so we were effectively starting from scratch. We wanted to build a system that was fast enough to support train-of-thought analysis and could scale linearly to support large and growing data volumes.
We never set out to build an in-memory system, but it became clear to us early on that if we wanted to exploit massive parallelisation, we could not be limited by disk IO speeds. Reading data from slow physical disks seriously limits the amount of parallelisation you can effectively deploy to any task, as the CPUs very quickly become starved of data and everything becomes disk IO bound.
This is the most basic and important point that is often missed when talking about in-memory. It's not the putting of data "in memory" that makes things faster. Memory, like disk, is just another place to park the data. It's the processors (CPUs) that run the actual data analysis code. Keeping the data in memory gives the CPUs fast access to it, keeping them fed with data and enabling parallelisation.
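The starvation point can be made concrete with a back-of-envelope calculation. All of the figures below are illustrative assumptions, not numbers from the article:

```python
# Hedged sketch with assumed, illustrative figures: how many cores can a
# single data source keep busy during a full-table scan before they starve?
disk_mb_s = 150         # assumed sequential throughput of one spinning disk
ram_mb_s = 20_000       # assumed sustained read bandwidth of main memory
core_demand_mb_s = 500  # assumed rate at which one core consumes scan data

cores_fed_by_disk = disk_mb_s / core_demand_mb_s  # 0.3 -- not even one core
cores_fed_by_ram = ram_mb_s / core_demand_mb_s    # 40.0 -- real parallelism

print(cores_fed_by_disk, cores_fed_by_ram)
```

Under these assumptions a lone disk cannot keep even one core fed, while RAM sustains dozens, which is why adding CPUs to a disk-bound system buys almost nothing.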
For this reason we decided to build a system which kept the data of interest in fast computer memory, or RAM (Random Access Memory). In retrospect this was a brave decision to make in the late 80s. Memory was still very expensive, but because we were rather young and naïve, we believed that the price would fall relatively quickly, making the holding of large data sets in memory an economical proposition. Ultimately we were right, even if it did take a couple of decades longer than we thought!
The point I'm making is this: when we took the decision to go in-memory, it dramatically changed our code design philosophy. Not being disk IO bound meant we became CPU bound, so code efficiency became hugely important. Every CPU cycle was precious and needed to be used as effectively as possible. For example, in the mid 90s we incorporated "dynamic code generation" into the software, a technique that dynamically compiles the execution phase of any query into low-level machine code, which is then distributed across all of the CPUs in the system. This technique reduced code path lengths by a factor of 10 to 100. I am not saying that advanced techniques like machine code generation are essential components of an in-memory system, but I am saying that using an efficient programming language is important when machine cycles matter. So probably not Java.
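The general idea behind dynamic code generation can be sketched as follows. This is not Kognitio's implementation: it uses Python bytecode where an MPP engine would emit machine code, and the toy query and function names are hypothetical:

```python
# Hedged sketch of "dynamic code generation": instead of re-interpreting a
# generic query plan for every row, compile the execution phase of this
# exact query into one specialized function, then run it over in-memory data.

def interpret(rows, threshold):
    # Generic row-at-a-time interpreter: checks the plan's predicate and
    # aggregate on every row via ordinary dispatch.
    total = 0
    for row in rows:
        if row["qty"] > threshold:
            total += row["price"]
    return total

def generate(threshold):
    # "Code generation": build source text for this exact query, compile it
    # once, and return the resulting function. The constant threshold is
    # baked directly into the generated code, shortening the per-row path.
    src = (
        "def q(rows):\n"
        f"    return sum(r['price'] for r in rows if r['qty'] > {threshold})\n"
    )
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["q"]

rows = [{"qty": q, "price": q * 2} for q in range(10)]
query = generate(5)
assert query(rows) == interpret(rows, 5)  # same answer, specialized code path
```

The payoff is the same in spirit as in the article: the generated function does only the work this one query needs, while the interpreter pays generic dispatch costs on every row.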
Designing code specifically for in-memory has another important benefit: besides being faster, RAM is also accessed in a different way to disk.