
Should You Move Your Enterprise Data Warehouse to the Cloud?

In many companies, the enterprise data warehouse (EDW) plays a central role in information analysis and decision making. For companies that have them, EDWs often represent a significant investment of time, money and resources. They are often used by multiple business units and departments, either directly or as a source of data for local data marts. Re-platforming or replacing one can be a significant effort with a large impact on the organization as a whole. Most data warehouses built before the last several years were developed as on-premises or hosted solutions, so some hesitancy about a move to the cloud is natural. That said, over the past couple of years cloud data warehouses have gained some compelling benefits that make the option worth considering. First, let’s review some of those benefits; then we will discuss some considerations for deciding whether moving the EDW to the cloud is the right decision for your organization.

Benefits of a Cloud Data Warehouse:

1. Scalability:

    • Cloud data warehouses can be scaled automatically when data increases or additional processing power is required.

2. Scale Compute and Storage Separately:

    • Unlike on-premises data warehouses or even Hadoop clusters, where scaling typically means adding both storage and compute, many cloud data warehouses can scale storage and compute separately. So, if you need more storage but not more compute power, you can increase just the storage. This can yield significant cost savings.

3. Pay As You Go:

    • Many of the cloud data warehouse platforms support a pay-as-you-go model. So you can “turn off” the data warehouse at night when no one is using it. In such a situation, you will pay only for the persistent storage of your data in the cloud.

4. Reduce Admin Burden on IT Staff:

    • Most cloud data warehouses are provided as managed services by the cloud vendor. The cloud vendor will take backups, do patching, and perform maintenance so in-house IT staff no longer needs to do that work.

5. Query Performance:

    • Anyone who has worked in data warehousing and business intelligence knows that poor query performance can negate all the good integration and design work that allowed the query to be run in the first place. Users simply won’t use query tools and platforms that perform poorly. Cloud data warehouses often scale automatically based on workload, so poor query performance becomes much less of a factor in user satisfaction. In-house DBAs can also spend less time on performance tuning and be redeployed to strategic initiatives.

6. Different Compute Profiles for Different User Communities:

    • As companies become more data driven and find new and innovative ways to leverage data, the number and variety of user communities grows as well. Perhaps the data science community requires significantly more processing power than the business analyst community. With cloud data warehouses, these two communities can be given different compute profiles rather than having to share the same one.

7. Don’t Have to Purchase for Peak Capacity:

    • With on-premises data warehouses, you typically have to estimate peak capacity, in terms of both storage and compute, and then purchase for that capacity to satisfy the users. As a result, much of the storage or processing capacity sits unused and is “wasted”. With the automatic scalability and elasticity of cloud data warehouses, you don’t have to purchase for peak capacity. You can provision the increased computing power or storage when you need it and for only as long as you need it.

8. Data Lake Integration and Support of Additional Data Types:

    • With the recent explosion of data available for analysis, non-traditional data sources have become very important. Companies want to be able to analyze social media feeds, web logs, machine logs, sensor data, customer chat logs, etc. They are creating data lakes in the cloud to capture all types of data and to serve as an initial “landing area” for all the organization’s data. Some cloud data warehouses can also serve as the data lake for the organization, using object storage to store and report on semi-structured data such as XML and JSON. Alternatively, a cloud data warehouse can integrate with a cloud data lake for data that needs to be transformed, curated, reconciled and made available to a community of non-technical users.

9. Easier use of AI and ML Capabilities:

    • Artificial intelligence (AI) and machine learning (ML), and the predictive and prescriptive analytics they enable, are becoming essential elements of a modern analytics environment. Processing for AI and ML is increasingly moving to the cloud due to the cost and maintenance burden associated with on-premises big data clusters. A cloud data warehouse can more easily integrate with cloud-based AI and ML workloads, both in providing data for analysis and in receiving back the results of the AI/ML analysis for visualization and exploration by business analysts.
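
The cost argument behind benefits 2, 3 and 7 (separate storage, pay as you go, no purchasing for peak) comes down to simple arithmetic. The sketch below is a rough illustration only: the prices, the "compute unit" abstraction, the usage pattern and the function names are all hypothetical assumptions, not any vendor's actual pricing model.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical unit prices; real vendor pricing differs."""
    compute_per_hour: float      # cost of one compute unit for one hour
    storage_per_tb_month: float  # cost of storing one TB for one month


def always_on_cost(p, peak_units, storage_tb, hours_in_month=730):
    """Compute sized for peak demand runs every hour of the month."""
    compute = p.compute_per_hour * peak_units * hours_in_month
    storage = p.storage_per_tb_month * storage_tb
    return compute + storage


def pay_as_you_go_cost(p, usage, storage_tb):
    """usage is a list of (compute_units, hours) pairs.

    Hours when the warehouse is paused are simply left out, so only
    storage is billed for them.
    """
    compute = sum(p.compute_per_hour * units * hours for units, hours in usage)
    storage = p.storage_per_tb_month * storage_tb
    return compute + storage


# Made-up numbers, chosen only to show the shape of the calculation.
pricing = Pricing(compute_per_hour=2.0, storage_per_tb_month=25.0)

# Illustrative month: 4 units for 200 business hours, 16 units for a
# 20-hour month-end peak, and compute paused the rest of the time.
usage = [(4, 200), (16, 20)]

fixed = always_on_cost(pricing, peak_units=16, storage_tb=10)
elastic = pay_as_you_go_cost(pricing, usage, storage_tb=10)

print(f"Sized for peak, always on: {fixed:,.2f}")
print(f"Pay as you go:             {elastic:,.2f}")
```

With these made-up inputs the pay-as-you-go figure comes out far lower, but the gap depends entirely on the usage pattern. A warehouse that is busy around the clock would see much less difference, so it is worth running this kind of estimate with your own workload and your vendor's published prices.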

As we can see, there are a number of significant benefits to moving an EDW to the cloud. However, that does not mean it is the right decision for all organizations. Let’s take a look at some things to consider when deciding whether to move the EDW to the cloud.

Decision Factors for Moving EDW to the Cloud:

1. Cost and Maintenance Burden of Existing EDW:

Is the existing EDW a costly solution that requires a significant maintenance burden? Are there frequent errors and downtime? Is performance a problem?

2. Data Center Strategy of the Organization:

What are the short- and long-term goals for data centers in the organization? Is it likely that Senior Management will ask to move the EDW to the cloud in the future?

3. Age of Existing EDW and Version of Technology Platforms:

How old is the existing EDW? Has it been kept up to date in terms of running the latest versions of ETL, database and business intelligence software? Is the vendor still supporting the version? If not, what is the cost of upgrading on-premises versus moving to the cloud?

4. Satisfaction Level of Existing Users:

Are the users satisfied with the existing EDW? How long is the enhancement queue, and how complex are the requested enhancements?

5. Evolving Analytics Landscape and the Need for Agility:

Are the requirements for the EDW changing rapidly? Does the EDW need to integrate with other analytics initiatives like data lakes, AI/ML or Big Data processing? What are the expansion plans for the business?

6. Scalability and Elasticity:

Is there a need for the auto-scaling and elastic capabilities of the cloud? Are the workloads for the existing warehouse stable and relatively static, or are they dynamic or trending in that direction?

7. Other Cloud Initiatives of the Organization:

Are there other initiatives in the organization to move source systems to the cloud or to move business intelligence tools to the cloud?

8. Security and Compliance:

Are there any security or compliance issues that would prevent moving the EDW to the cloud? (This is becoming less and less of an issue as the major cloud vendors receive security and compliance certifications.)
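
One way to keep a review of these eight factors organized is to record, for each factor, whether it currently points toward the cloud. The sketch below is only an illustration: the function name `summarize_review`, the True/False/None convention and the sample answers are assumptions rather than an established methodology, and it deliberately avoids weights or a single score.

```python
# The eight decision factors from the list above.
FACTORS = [
    "Cost and maintenance burden of existing EDW",
    "Data center strategy of the organization",
    "Age of existing EDW and version of technology platforms",
    "Satisfaction level of existing users",
    "Evolving analytics landscape and the need for agility",
    "Scalability and elasticity",
    "Other cloud initiatives of the organization",
    "Security and compliance",
]


def summarize_review(answers):
    """Group factors by the direction they point.

    answers maps each factor name to True (points toward the cloud),
    False (points toward staying put) or None (not yet decided).
    """
    missing = [f for f in FACTORS if f not in answers]
    if missing:
        raise ValueError(f"No answer recorded for: {missing}")
    return {
        "toward_cloud": [f for f in FACTORS if answers[f] is True],
        "toward_staying": [f for f in FACTORS if answers[f] is False],
        "undecided": [f for f in FACTORS if answers[f] is None],
    }


# Hypothetical answers for an imaginary organization.
sample_answers = dict.fromkeys(FACTORS, None)
sample_answers["Cost and maintenance burden of existing EDW"] = True
sample_answers["Scalability and elasticity"] = True
sample_answers["Satisfaction level of existing users"] = False

summary = summarize_review(sample_answers)
for direction, factors in summary.items():
    print(f"{direction}: {len(factors)}")
    for factor in factors:
        print(f"  - {factor}")
```

The grouping is meant as a conversation aid, not a verdict: a single blocking factor, such as an unresolved compliance issue, can outweigh several factors that point the other way.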

The decision to move the EDW to the cloud should be made after a careful and thoughtful review of the considerations above. The cloud offers great promise for EDWs and analytics in general, but it is important to fully understand and define the objectives, costs and impact before committing to an initiative to move your EDW to the cloud.

We have helped many companies define their data warehouse and analytics roadmaps including whether or not to move to the cloud. We would welcome an opportunity to sit down with you. Leave a comment below and visit our website to learn more at