Learn from Google’s Data Engineers: Dimensional Data Modeling is Dead
- by 7wData
My first day at Google was in 2019, and one of my biggest surprises was that I didn’t find any dimensional data marts. I was used to joining facts to dimensions, could rattle off normal forms, and preached the best practices of data modeling. I considered myself an expert on slowly-changing dimensions and how to apply them. Dimensional data modeling, popularized by Ralph Kimball in his 1996 book, is a method of organizing data within a data warehouse. While many benefits are preached, I believe it exists for three primary reasons: optimizing compute, organizing data by topic, and optimizing storage. The foundational needs that drove the advent of dimensional modeling have evolved over time. Let’s take this moment in history to revisit why dimensional modeling exists and how well its roots meet our needs today.
In the early days of computing, storage came at a premium: as much as $90,000 per gigabyte in 1985. At those prices, data needed to be organized in a way that minimized the number of times the same value was stored. To do this, database keys were used as pointers, connecting many records to a single stored copy of a value. Database normalization came about to describe how de-duplicated, and therefore how storage-optimized, a database had become. Instead of storing a long descriptive value every time a dimension appeared in a table, we could store the description just once and connect it to each record it is meant to describe.

Compute optimization was equally important: in 1985, the fastest computer put out 1.9 gigaflops of computational power for a whopping $32M. For reference, some of the fastest computers today put out over 400 petaflops of compute, roughly 200 million times more flops. In the early days of databases, reducing the number of operations and the average complexity of each operation could save companies millions. For example, instead of needing to compare long, complex string values to relate records, we could assign a single integer to each unique value and relate these integers instead of strings. To accomplish these optimizations, data was organized into topic-oriented data models. One common topic-oriented model was the star schema.
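To make the integer-key idea concrete, here is a minimal sketch in Python; the product data, names, and two-table layout are hypothetical, invented for illustration:

```python
# A toy "sales" dataset where the same long description repeats per row.
raw_rows = [
    ("Acme Anvil, 10lb, cast iron", 19.99),
    ("Acme Anvil, 10lb, cast iron", 19.99),
    ("Roadrunner Seed, 5lb bag", 4.50),
]

# Dimension side: each unique description is stored exactly once,
# identified by a small integer surrogate key.
product_key = {}   # description -> integer key
product_dim = []   # integer key -> description

# Fact side: rows keep only the cheap integer key plus the measure.
fact_rows = []
for description, price in raw_rows:
    if description not in product_key:
        product_key[description] = len(product_dim)
        product_dim.append(description)
    fact_rows.append((product_key[description], price))

# Relating records now means comparing integers, not long strings,
# and each description occupies storage once instead of once per row.
for product_id, price in fact_rows:
    print(product_dim[product_id], price)
```

This is the same trade a normalized database makes at scale: long values live once in a dimension, and every fact row carries only a compact key.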
The benefit of this model was that the center of the star, the fact table, contained values that were physically indexed and easily retrievable. The more costly values to retrieve were stored in dimension tables and could be selectively retrieved, saving on processing cost. If new dimensions arose relating to the fact, new dimension tables would have to be put in place, key relationships would need to be enforced, and normalization compliance would have to be maintained. In successful dimensional modeling, the source data tables were torn apart and distributed among many tables and, if done correctly, could be re-assembled back into the source table when necessary.
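As a rough illustration of that tear-apart-and-reassemble cycle, here is a toy star schema built with Python’s standard-library SQLite; the fact and dimension tables shown are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table at the center of the star, with a key out to each dimension.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Acme Anvil, 10lb, cast iron');
INSERT INTO dim_store   VALUES (1, 'Albuquerque');
INSERT INTO fact_sales  VALUES (1, 1, 19.99);
""")

# Re-assembling the original wide source table is a join from the
# fact table out to each of its dimensions.
rows = conn.execute("""
    SELECT p.description, s.city, f.amount
    FROM fact_sales f
    JOIN dim_product p USING (product_id)
    JOIN dim_store   s USING (store_id)
""").fetchall()
print(rows)  # [('Acme Anvil, 10lb, cast iron', 'Albuquerque', 19.99)]
```

Every new dimension means another table, another key relationship to enforce, and another join on the way back out.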
Database normalization is showing its age. The cost of 1 GB of Google Cloud storage per month is just 2 cents; at that rate, even a terabyte-scale table costs roughly $20 per month to store. The returns for breaking a long and/or wide table down into a star or snowflake schema are poor.