5 Common Challenges to Building BI in the Cloud
- by 7wData
Building successful Business Intelligence solutions is a well-documented process, with many successful and unsuccessful projects to learn from. The traditional BI/DW model has always been challenging, but many good practices and patterns have emerged over the years that BI professionals can leverage.
A net-new BI solution, or a migration of an existing on-prem BI solution into the cloud, creates a different set of challenges. I wanted to come up with a top 5 list that may help with your cloud BI project planning. I've been focused on building analytics, BI and Big Data solutions in Azure for the past 2 years, so I'm going to share a few of my findings here.
1. Loading data into the Cloud
In my experience, this is where you will spend the bulk of your time.
Getting access to data sources and loading large initial data sets into the cloud, not to mention building out a cloud-based ETL infrastructure, is challenging. You will have to address connectivity issues for hybrid scenarios, push large amounts of data into the cloud and devise an archival strategy with cloud storage, which is quite different from a classic on-prem SAN appliance approach. If you are building a SQL Server-based BI solution in Azure IaaS VMs, use tools like AzCopy or Azure Data Factory to load your initial data sets into Azure Storage. After the initial load, delta data loads are the much-preferred approach, and you can use tools such as Azure Data Factory, SSIS, Attunity or Informatica for that purpose.
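To make the delta-load idea concrete, here is a minimal sketch of one common way to do it, a "high-water mark" on a last-modified column. It is an illustration only, not how Azure Data Factory or SSIS implement it: the `sales` table, its `modified_at` column and the `load_delta` helper are all hypothetical, and in-memory SQLite databases stand in for the on-prem source and the cloud target.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'sales' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT)"
    )
    return conn


def load_delta(source, target, last_watermark):
    """Copy only rows changed after last_watermark; return (new_watermark, rows_moved)."""
    rows = source.execute(
        "SELECT id, amount, modified_at FROM sales "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_watermark,),
    ).fetchall()
    # Upsert so that updated rows overwrite the copy already in the target.
    target.executemany(
        "INSERT OR REPLACE INTO sales (id, amount, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    new_watermark = rows[-1][2] if rows else last_watermark
    return new_watermark, len(rows)


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 10.0, "2017-01-01T00:00:00"), (2, 20.0, "2017-01-02T00:00:00")],
)
source.commit()

# Initial load: an empty watermark sorts before every timestamp, so all rows move.
watermark, initial_rows = load_delta(source, target, "")

# Later, one row changes and one is added on the source side.
source.execute(
    "UPDATE sales SET amount = 25.0, modified_at = '2017-01-03T00:00:00' WHERE id = 2"
)
source.execute("INSERT INTO sales VALUES (3, 30.0, '2017-01-04T00:00:00')")
source.commit()

# Delta load: only the two changed rows cross the wire.
watermark, delta_rows = load_delta(source, target, watermark)
```

The watermark has to be stored somewhere durable between runs, and this approach does not detect deleted rows; change tracking or change data capture on the source is the usual answer when deletes matter.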
When archiving data and storing backups, use less expensive Azure storage such as Standard storage, and consider DRaaS tools like Azure Site Recovery (ASR). You will want to use Premium disks for running VM workloads in Azure, but Standard storage is fine for backups, and it can also be geo-replicated for data protection.
If you are using shared, managed public services in Azure like SQL DB, SQL DW, Azure Analysis Services, Power BI, etc., you will find native built-in cloud adapters and capabilities that allow you to connect to data in Azure Storage, Azure Data Lake, HDInsight and other cloud-native sources. But in most cases, unless you are working on a net-new application for a new business, you will need to deploy a hybrid architecture that also brings in data from on-prem data sources. In that case, you will need to install local proxy data management gateways for data movement and for Power BI.
This is a very complex and lengthy topic that requires different tools for different architectural approaches in Azure. I review these in more detail in this story here.
You will want to understand the latency, response-time and geo-location requirements of your users before planning your cloud BI deployment. When connecting to a public cloud, be aware that general cloud services are exposed on public IP addresses, which in many cases require firewall whitelisting for controlled access. You can also configure virtual networks and VPNs to connect your corporate network to many, but not all, of the services in Azure. And if you require low-latency, high-availability connectivity to Azure, consider purchasing an ExpressRoute circuit for direct connectivity. Just be sure to set aside time early in your project cycle for the tasks needed to establish connectivity requirements.
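One rough way to get a first read on latency from a given user location is to time TCP connections to the service endpoint. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it measures connection setup time only (not query or report response time), and the endpoint must already be reachable through your firewall rules.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=3.0):
    """Time several TCP connection setups to host:port, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open and immediately close the connection; only setup time is measured.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings):
    """Reduce raw timings to the figures most useful for planning."""
    return {
        "min_ms": min(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


# Example (hypothetical server name; 1433 is the standard SQL Server port):
# print(summarize(tcp_connect_latency_ms("yourserver.database.windows.net", 1433)))
```

Running this from each office or region where your users sit gives a comparable baseline before and after you add a VPN or a dedicated circuit.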
Public IPs, firewalls, load balancers, VNets and VPNs will all still need to be configured even though you are leveraging a shared cloud platform. If you are going to use Power BI, which is an Office 365 public cloud service, there is a new offering being released called Power BI Premium that can provide dedicated capacity for larger Power BI implementations. Take a look at this whitepaper for more on PBI Premium.
Data governance is an important aspect of any analytics project.