What is the Open Data Ecosystem and Why it’s Here To Stay
- by 7wData
Today’s companies are collecting massive amounts of data to better understand their customers and to make better, more informed business decisions. Frequently, all of this data resides across dozens, sometimes thousands of different sources and in multiple formats, both structured and unstructured.
Connecting all of this data and making sense of it is a massive and highly complicated task, but it’s essential. To be successful, companies have to be able to connect the dots across varied data sources and data types. Only then can they realize insights and take meaningful action.
Over the past decade, a series of technologies have come on the scene promising to solve this problem. Led by the Hadoop movement, it first began in the mid-2000s when products and companies started sprouting up, creating an Open Data Ecosystem. This movement towards composable technologies (oftentimes open source, but not necessarily) that integrate with APIs and run on commoditized hardware challenged the status quo of monolithic, interdependent architecture in a big way.
By adopting an open, distributed approach on commoditized hardware, these companies challenged the traditional setup of storing and processing data in proprietary, centralized data warehouses. But ultimately, these solutions under-achieved their grandiose promise because they became unwieldy, difficult to manage and economically unscalable.
More recently, we’ve witnessed the revival of the Open Data Ecosystem. Due to the rise of the cloud, a proliferation of open-source data formats and the arrival of vendors solving for earlier pain points, we’ve seen a new breed of open data ecosystem companies emerge and grow in popularity. These new solutions are able to capture the full scope of data that resides within a company, enabling teams to leverage the data to its full advantage.
At Sapphire Ventures , we’ve been investing in a range of enterprise technology companies spanning data, analytics, AI, open source, DevOps, security and more for more than a decade. Having met with hundreds of companies across these industries over the years, we’d like to think we know these areas well. And it’s our belief that open data ecosystem companies are now perfectly in the right place at the right time.
Big Data and the Birth of the Open Data Ecosystem
For decades, companies relied on traditional databases or warehouses, a mostly proprietary, centralized repository where structured data was stored and processed. The traditional data warehouse system required buying pricey on-premises hardware, maintaining structured data in proprietary formats and relying on a centralized data and IT department to deliver analysis.
This system -- a RDBMS or traditional data warehouse -- worked while enterprises collected a modest amount of structured data. But in the mid-2000s, companies like Google ran into challenges with this model. As pioneers in the internet economy, they had to process more raw data than anyone had before, a meaningful amount of which in non-relational format.
Google is just one example of a large corporation that needed a place to centrally process structured data (e.g., relational tables), semi-structured data (e.g., logs) and unstructured data (e.g. videos and photos).
At the time, there was no supercomputer big enough for this task. So to keep up, Google wired an ever-expanding number of computers together into a fleet.
Eventually, this computing infrastructure grew so big that hardware failures became inevitable, and each programmer had to figure out how to handle them individually. To address these challenges, MapReduce, which could process parallely and generate huge data sets over large clusters of commoditized hardware, was born.
[Social9_Share class=”s9-widget-wrapper”]
Upcoming Events
Shift Difficult Problems Left with Graph Analysis on Streaming Data
29 April 2024
12 PM ET – 1 PM ET
Read MoreCategories
You Might Be Interested In
No-code, self-service analytics are the shortcut to a data-driven business
28 Oct, 2021The vast majority of businesses have already recognised that data is not just a by-product of their operations, but a …
Can data and analytics help to improve student learning?
8 Jun, 2015The education landscape is shifting rapidly and with it the manner in which students learn. The latest generation of …
Difference Between RPA and RDA
22 Aug, 2021In today’s technology-driven digital era, businesses are constantly looking toward fast, efficient delivery by limiting human actions on certain mundane …
Recent Jobs
Do You Want to Share Your Story?
Bring your insights on Data, Visualization, Innovation or Business Agility to our community. Let them learn from your experience.