Orchestrating Data Pipelines Facilitates Data-Driven Analytics
- by 7wData
I have written a few times in recent months about vendors offering functionality that addresses data orchestration. This is a concept that has grown in popularity over the past five years amid the rise of Data Operations (DataOps), which describes more agile approaches to data integration and data management. In a nutshell, data orchestration is the process of combining data from multiple operational data sources and preparing and transforming it for analysis. To those unfamiliar with the term, this may sound very much like the tasks that data management practitioners have been undertaking for decades. As such, it is fair to ask what separates data orchestration from traditional approaches to data management. Is it really something new that can deliver innovation and business value, or just the rebranding of existing practices designed to drive demand for products and services?
Key to understanding why data orchestration is different, and necessary, is viewing data management challenges through the lens of modern data-processing requirements and challenges. As I have noted, data-driven organizations stand to gain competitive advantage, responding faster to worker and customer demands for more innovative, data-rich applications and personalized experiences. Being data-driven requires a combination of people, processes, information and technology improvements involving data culture, data literacy, data democracy, and data curiosity. Encouraging employees to discover and experiment with data is a key aspect of being data-driven that requires new, agile approaches to data management. Meanwhile, the increasing reliance on real-time data processing is driving requirements for more agile, continuous data processing. Additionally, the rapid adoption of cloud computing has fragmented where data is accessed or consolidated, with data increasingly spread across multiple data centers and cloud providers.
Traditional approaches to data management are rooted in point-to-point batch data processing, whereby data is extracted from its source, transformed for a specific purpose, and loaded into a target environment for analysis. These approaches are unsuitable for the demands of modern analytics environments, which instead require agile data pipelines that can traverse multiple data-processing locations and evolve in response to changing data sources and business requirements. I assert that by 2024, six in ten organizations will adopt data-engineering processes that span data integration, transformation and preparation, producing repeatable data pipelines that create more agile information architectures. Given the increasing complexity of evolving data sources and requirements, there is a need to enable the flow of data across the organization through new approaches to the creation, scheduling, automation, and monitoring of workflows. This is the realm of data orchestration.
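The difference between a point-to-point script and an orchestrated pipeline can be illustrated with a minimal sketch. Here, using Python's standard-library `graphlib`, the pipeline is declared as a directed acyclic graph (DAG) of tasks, and an orchestrator function decides execution order from the declared dependencies. The task names and data are hypothetical and purely illustrative; real orchestration tools add scheduling, retries, and monitoring on top of this core idea.

```python
from graphlib import TopologicalSorter

# Hypothetical extract/transform/load steps; in practice each would
# touch a real data source or target.
def extract():
    return [{"id": 1, "amount": "42.50"}, {"id": 2, "amount": "17.00"}]

def transform(rows):
    # Convert string amounts to floats for analysis.
    return [{**r, "amount": float(r["amount"])} for r in rows]

def load(rows):
    # Stand-in for loading into a target; returns a summary value.
    return sum(r["amount"] for r in rows)

# Declare the pipeline as a DAG: each task lists its upstream
# dependencies, so the orchestrator (not the tasks themselves)
# determines execution order and can evolve as sources change.
dag = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
tasks = {"extract": extract, "transform": transform, "load": load}

def run(dag, tasks):
    results = {}
    # Execute tasks in dependency order, passing upstream results along.
    for name in TopologicalSorter(dag).static_order():
        upstream = [results[dep] for dep in sorted(dag[name])]
        results[name] = tasks[name](*upstream)
    return results

results = run(dag, tasks)
print(results["load"])  # 59.5
```

Because the dependency graph is data rather than hard-wired call order, adding a new source or an extra transformation step means editing the DAG declaration, not rewriting the pipeline, which is what makes this style more agile than point-to-point batch jobs.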