How to choose a cloud data management architecture

Donald Feinberg, vice-president and distinguished analyst in Gartner’s ITL Data and Analytics (D&A) group, explores the different kinds of cloud architecture for data management, and why D&A leaders need to balance the risks and benefits of each
Data, whether customer or business data, is increasingly valuable to today’s organisations. It helps businesses stay competitive and ahead of the curve, giving them the intelligence to make smarter decisions quickly.
However, it is important to recognise that a data-driven strategy can demand too much of a business – particularly if the right tools and solutions to manage those additional needs aren’t in place.
Cloud data management architectures are therefore critical. However, D&A leaders need to be aware of the different architecture choices, from on-premises to multi-cloud and intercloud, and to understand the risks and benefits that come with managing data across diverse and distributed deployment environments.
Here, I take a look at the different cloud data management architectures and the considerations D&A leaders need to be mindful of.
In an on-premises to cloud model (also known as “ground to cloud”), different components of an application architecture may reside on-premises and/or on one cloud. For example, the database management system (DBMS) might reside on-premises while an application that connects to it, such as a business intelligence (BI) dashboard, resides in the cloud.
There are two variations of on-premises to cloud architectures:
An active approach, as its name implies, deals with active data management between the two environments. This may include architectures with data residing both in the cloud and on-premises, such as the ability of the DBMS to have some replicas, partitions or shards residing on-premises and some in the cloud for the same database.
There are many application use cases for this kind of functionality, including: partitioning data by age, frequency of access or geography; dynamic capacity allocation to accommodate inconsistent, surge demand on resources; and regulatory requirements governing data locality.
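As a rough illustration of the partitioning use cases above, the following Python sketch decides where a record should live based on its region and how recently it was accessed. The 90-day window, the region codes and every name in it (`Record`, `choose_location`, `HOT_WINDOW`, `LOCAL_ONLY_REGIONS`) are assumptions made for the example, not features of any particular DBMS, which would normally express such rules through its own partitioning or sharding configuration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy values, chosen only for illustration.
HOT_WINDOW = timedelta(days=90)      # records accessed within this window count as "hot"
LOCAL_ONLY_REGIONS = {"DE", "FR"}    # regions assumed to require on-premises storage


@dataclass
class Record:
    record_id: str
    region: str
    last_accessed: date


def choose_location(record: Record, today: date) -> str:
    """Return 'on_premises' or 'cloud' for a record under the assumed policy."""
    # Data locality rules take priority over cost or performance concerns.
    if record.region in LOCAL_ONLY_REGIONS:
        return "on_premises"
    # Recently accessed data stays close to the on-premises DBMS.
    if today - record.last_accessed <= HOT_WINDOW:
        return "on_premises"
    # Older, rarely accessed data goes to the cloud partition.
    return "cloud"
```

The ordering of the checks is the design choice worth noting: the regulatory rule is evaluated first, so a cost-driven rule can never move regulated data off-premises.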
In an active on-premises to cloud model, it is critical to understand the characteristics of the data flow (for example, whether data is flowing into or out of the cloud and the expected volumes of data). There may be issues with latency, that is, the time it takes to move the data between on-premises and cloud. Additionally, there may be financial implications driven by cloud service provider (CSP) data egress charges. Integration, metadata and governance practices that span multiple environments must also be considered. Service level agreements (SLAs) should be defined and tested. This may lead to a requirement for a special communications link between the on-premises and cloud components, with further cost implications.
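The latency and egress considerations above lend themselves to a back-of-the-envelope calculation. The sketch below is a simplification: it uses decimal units, assumes the link runs at its full rated bandwidth, and treats egress pricing as a flat per-gigabyte rate. Real CSP pricing is usually tiered, so any rate passed in is a placeholder rather than a quoted price. The function names are hypothetical.

```python
def transfer_time_seconds(volume_gb: float, bandwidth_mbps: float) -> float:
    """Estimate the time to move volume_gb over a link rated at bandwidth_mbps.

    Assumes decimal units (1 GB = 8,000 megabits) and full link utilisation,
    so the result is a lower bound on the real transfer time.
    """
    megabits = volume_gb * 8 * 1000
    return megabits / bandwidth_mbps


def egress_cost(volume_gb: float, price_per_gb: float) -> float:
    """Estimate the egress charge at a flat, assumed per-gigabyte rate."""
    return volume_gb * price_per_gb
```

For instance, `transfer_time_seconds(100, 1000)` gives 800 seconds for 100 GB over a 1 Gbps link. Running the same figures against the expected daily volume is a quick way to judge whether a dedicated link is worth its cost.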
In an on-demand approach, components remain separate. Data is moved between environments only when necessary to support business activities like disaster recovery planning or development lifecycle functions. For example, any of the development, test, quality assurance (QA), disaster recovery (DR) or production instances of a DBMS may reside on-premises or in the cloud.


