Dremio: Simpler and faster data analytics

Now is a great time to be a developer. Over the past decade, decisions about technology have moved from the boardroom to innovative developers, who are building with open source and making decisions based on the merits of the underlying project rather than the commercial relationships provided by a vendor. New projects have emerged that focus on making developers more productive, and that are easier to manage and scale. This is true for virtually every layer of the technology stack. The result is that developers today have almost limitless opportunities to explore new technologies, new architectures, and new deployment models.

Looking at the data layer in particular, NoSQL systems such as MongoDB, Elasticsearch, and Cassandra have pushed the envelope in terms of agility, scalability, and performance for operational applications, each with a different data model and approach to schema. Along the way many development teams moved to a microservices model, spreading application data across many different underlying systems.

In terms of analytics, old and new data sources have found their way into a mix of traditional data warehouses and data lakes, some on Hadoop, others on Amazon S3. And the rise of the Kafka data streaming platform creates an entirely different way of thinking about data movement and analysis of data in motion.

With data in so many different technologies and underlying formats, analytics on modern data is hard. BI and analytics tools such as Tableau, Power BI, R, Python, and machine learning models were designed for a world in which data lives in a single, high-performance relational database. In addition, users of these tools – business analysts, data scientists, and machine learning models – want the ability to access, explore, and analyze data on their own, without any dependency on IT.

Introducing the Dremio data fabric

BI tools, data science systems, and machine learning models work best when data lives in a single, high-performance relational database. Unfortunately, that’s not where data lives today. As a result, IT has no choice but to bridge that gap through a combination of custom ETL development and proprietary products. In many companies, the analytics stack includes the following layers:

Data staging. The data is moved from various operational databases into a single staging area such as a Hadoop cluster or cloud storage service (e.g., Amazon S3).

Data warehouse. While it is possible to execute SQL queries directly on Hadoop and cloud storage, these systems are simply not designed to deliver interactive performance. Therefore, a subset of the data is usually loaded into a relational data warehouse or MPP database.

Cubes, aggregation tables, and BI extracts. In order to provide interactive performance on large datasets, the data must be pre-aggregated and/or indexed by building cubes in an OLAP system or materialized aggregation tables in the data warehouse.
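
To make the layers above concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for a real staging area and warehouse. The table names (staging_orders, agg_sales_by_region), the function names, and the sample rows are all hypothetical and chosen only for illustration; this is not how Dremio or any particular warehouse implements these layers.

```python
import sqlite3


def build_layers(rows):
    """Load raw rows into a staging table, then materialize an aggregation table.

    Assumption: an in-memory SQLite database stands in for both the
    staging area and the data warehouse.
    """
    conn = sqlite3.connect(":memory:")
    # Staging layer: raw records copied from operational systems.
    conn.execute("CREATE TABLE staging_orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?)", rows)
    # Aggregation layer: a pre-computed summary that BI queries read
    # instead of scanning the raw data.
    conn.execute(
        "CREATE TABLE agg_sales_by_region AS "
        "SELECT region, SUM(amount) AS total, COUNT(*) AS orders "
        "FROM staging_orders GROUP BY region"
    )
    conn.commit()
    return conn


def sales_by_region(conn):
    """Answer a BI-style query from the pre-aggregated table."""
    return conn.execute(
        "SELECT region, total, orders FROM agg_sales_by_region ORDER BY region"
    ).fetchall()


# Hypothetical sample data.
raw_rows = [("east", 100.0), ("west", 40.0), ("east", 25.5), ("west", 10.0)]
conn = build_layers(raw_rows)
print(sales_by_region(conn))  # [('east', 125.5, 2), ('west', 50.0, 2)]

# New data lands in staging, but the aggregation table is a copy:
# it stays stale until someone rebuilds it.
conn.execute("INSERT INTO staging_orders VALUES ('east', 1.0)")
print(sales_by_region(conn))  # unchanged: [('east', 125.5, 2), ('west', 50.0, 2)]
```

The second query returns the same result as the first even though new data has arrived, which illustrates why each additional copy of the data adds maintenance work for whoever owns the pipeline.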

This multi-layer architecture introduces many challenges. It is complex, fragile, and slow, and creates an environment where data consumers are entirely dependent on IT.

Dremio introduces a new tier in data analytics we call a self-service data fabric.
