Anatomy of a modern data stack and 4 key benefits it creates

Construction firms that know how to harness their data are increasingly at a competitive advantage in today’s complex world — and the latest research emphasizes just how much of an advantage.

In fact, faulty construction data may have caused $1.8 trillion in losses worldwide and been responsible for 14% of avoidable rework, or $88 billion, according to Autodesk and FMI.

That same report found that 75% of contractors said there’s an increased need for rapid decision-making in the field — exactly where good data is crucial. But only 55% of contractors had implemented a formal data strategy for project data, and only 12% always incorporated project data into their decision-making.

The solution to these problems is the modern data stack. But just what is it, and why should contractors care?

“The modern data stack is a scalable, low barrier to entry, group of technologies that firms can adopt to drive value from their data,” said Matt Monihan, CEO of ResponseVault, a data-engineering firm specializing in the construction industry. “That’s important because, with the modern data stack, you can surface data without every single app having a direct integration with another one.”

Monihan said the goal is true data integration, which many construction firms mistakenly believe they have already achieved because of questionable integration claims from software makers. While integrations may technically be available, they don’t always provide the data insights firms need to make smarter decisions and predict outcomes. “The granularity of the integration is key and varies between vendors,” Monihan said.

In this article, we’ll explore the anatomy of the modern data stack — and answer questions about four key benefits it creates.

Point solutions are where your data originates. Whether it comes from the field, the office or the owner, your data is collected either in a structured form, like Procore change events, or from free-form sources like spreadsheets. The data generated by these point solutions runs your business, and the solutions are built to collect that data properly.

Once you’ve collected your jobsite data in point solutions, the next step is to securely and reliably export that data into a storage container, often referred to as a data warehouse, data lake or even a data lakehouse. As technology evolves, the differences between those industry terms have blurred, but what’s important is that a piece of middleware is required to move the data between the point solution and its staging area in the warehouse.
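As a rough illustration of that middleware step, the Python sketch below copies raw records into a staging area as JSON Lines and tags each one with where it came from and when it was loaded. Everything named here is an assumption made for the example: `fetch_change_events` is a hypothetical stand-in for a point-solution export (it is not a real vendor API), and a local folder plays the role of the warehouse staging area.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def fetch_change_events():
    """Hypothetical stand-in for a point-solution export (not a real vendor API)."""
    return [
        {"id": 101, "project": "Tower A", "amount": 12500.0, "status": "approved"},
        {"id": 102, "project": "Tower A", "amount": 4300.0, "status": "pending"},
    ]


def stage_records(records, source, staging_dir):
    """Write raw records, unchanged, to a JSON Lines file in the staging area."""
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    loaded_at = datetime.now(timezone.utc).isoformat()
    path = staging_dir / f"{source}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            # Keep the payload as-is; add only lineage metadata around it.
            row = {"_source": source, "_loaded_at": loaded_at, "payload": record}
            f.write(json.dumps(row) + "\n")
    return path
```

Calling `stage_records(fetch_change_events(), "change_events", "staging")` would write `staging/change_events.jsonl`. Leaving the payload untouched at this stage is a deliberate choice in the sketch: cleaning and modeling can then happen later, inside the warehouse, without going back to the source system.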

The data we’re extracting from our point solutions needs to live somewhere, and this is where you choose a storage method. The cost of entry for this component has fallen, both with the introduction of Amazon Redshift as a lower-cost analytical database and with the rise of accessible, open-source databases whose features enable many use cases that weren’t previously possible. So, once you’ve selected and set up your storage and data is flowing, the next step is doing something with the data: analysis.
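To make the storage step concrete, here is a minimal Python sketch that loads a few records into a table and runs a simple analytical query over them. It uses an in-memory SQLite database purely as a stand-in for a warehouse; the table name `stg_change_events`, its columns, and the sample rows are all invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stg_change_events ("
    "id INTEGER PRIMARY KEY, project TEXT, amount REAL, status TEXT)"
)

# Illustrative rows; in practice these would arrive from the extraction step.
rows = [
    (101, "Tower A", 12500.0, "approved"),
    (102, "Tower A", 4300.0, "pending"),
    (103, "Bridge B", 8000.0, "approved"),
]
conn.executemany("INSERT INTO stg_change_events VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Total approved change-event value per project.
totals = conn.execute(
    "SELECT project, SUM(amount) FROM stg_change_events "
    "WHERE status = 'approved' GROUP BY project ORDER BY project"
).fetchall()
print(totals)  # [('Bridge B', 8000.0), ('Tower A', 12500.0)]
```

The same pattern (land the data in a table, then query it with SQL) carries over to a hosted analytical database, although connection details and SQL dialect would differ.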

Now is the time to model the data across your data sources, identify the fields that join disparate data sets, and clean the data into unified models. This step requires a dedicated data analyst who can communicate with people in the field who are generating the source data and reconcile any discrepancies with stakeholders looking for reports and dashboards.
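A small Python sketch of this modeling step might look like the following. It assumes two hypothetical sources, change events from a project-management tool and budgets from a spreadsheet, that share a project code entered inconsistently. The field names, sample values and the `normalize_code` rule are all assumptions for the example.

```python
def normalize_code(code):
    """Trim whitespace and upper-case so the same project matches across sources."""
    return code.strip().upper()


def build_project_model(change_events, budgets):
    """Combine change-event totals with budgets, keyed on the cleaned project code."""
    model = {}
    for row in budgets:
        code = normalize_code(row["project_code"])
        model[code] = {"budget": row["budget"], "approved_changes": 0.0}
    unmatched = []
    for event in change_events:
        code = normalize_code(event["project_code"])
        if code not in model:
            # No budget row for this code: a discrepancy to reconcile with the field.
            unmatched.append(event)
            continue
        if event["status"] == "approved":
            model[code]["approved_changes"] += event["amount"]
    return model, unmatched


change_events = [
    {"project_code": " twr-a", "amount": 12500.0, "status": "approved"},
    {"project_code": "TWR-A ", "amount": 4300.0, "status": "pending"},
    {"project_code": "brg-b", "amount": 8000.0, "status": "approved"},
    {"project_code": "GAR-C", "amount": 950.0, "status": "approved"},
]
budgets = [
    {"project_code": "TWR-A", "budget": 2_000_000.0},
    {"project_code": "BRG-B", "budget": 750_000.0},
]

model, unmatched = build_project_model(change_events, budgets)
```

Here `model` holds one unified record per project, and `unmatched` collects the `GAR-C` event that has no budget row, which is the kind of discrepancy an analyst would take back to the people who entered the data.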

Enjoyed this summary? Read the complete article at the source:

Continue at constructiondive.com →

Yves Mulkers

Yves Mulkers is the founder of 7wData and a widely followed voice in the data and AI community. He curates the 7wData and AI Beat newsletters, reaching hundreds of thousands of data and AI professionals, and writes on data strategy, analytics, AI, and the evolving data ecosystem.