Four perspectives on data lakes
- by 7wData
Recently I was involved in creating a series of short videos about data lakes with a number of other IBM colleagues. These videos introduce four perspectives which cover the areas of architecture, value, innovation and governance. Data lakes are a very popular concept in the industry at the moment, but definitions of a data lake seem to vary widely.
My view is that a data lake is a reference architecture that balances the desire for easy access to data with information governance and security. The data lake reference architecture describes the technical capabilities necessary for a system of insight, while being independent of specific technologies. Being technology independent is important because most organizations already have investments in data platforms that they want to incorporate in their solution. In addition, technology is continually improving, and the choice of technology is often dictated by the volume, variety, and velocity of the data being managed.
A system of insight needs more than technology to succeed. The data lake reference architecture includes descriptions of governance and management processes and definitions to ensure the human and business systems around the technology support a collaborative, self-service, and safe environment for data use.
Governance is a practice that you apply to “something.” Just like James Watt’s fly-ball governor for the steam engine, a governance program seeks to keep an engine in balance so it works effectively. This engine may be a process, organization, or flow of information. The important point is that the target of what you are governing is clearly defined.
Approaches to governance, particularly around a data lake, vary widely due to the different choices that organizations make in their definition of the engine being managed. For example, the IT department may see the data lake engine as a collection of technology working together. The business may see the data lake as part of an innovation engine helping them to create new value from data. So which is the right engine to govern? It depends on the objective for the data lake.