Plan Carefully when Migrating to a Cloud Data Warehouse
Once you have decided to migrate your data warehouse to a cloud-based database, the hard and risky work of data migration begins.
By Philip Russom
November 7, 2019
Organizations of all sizes and maturities already have data warehouses deployed and in operation. Modernizing, upgrading, or otherwise improving an incumbent warehouse regularly involves migrating data from platform to platform, and migrations today increasingly move data from on-premises to cloud systems. This is because replatforming is a common data warehouse modernization strategy, whether you will rip-and-replace the warehouse's primary platform or augment it with additional data platforms.
Even when using an augmentation strategy for data warehouse modernization, "data balancing" is an inevitable migration task as you redistribute data across the new combination of old and new platforms.
In a related direction, some data warehouse modernization strategies simplify bloated and redundant portfolios of databases (or take control of rogue data marts and analytics sandboxes) by consolidating them onto fewer platforms, with cloud-based databases increasingly serving as a consolidation platform.
In all these modernization strategies, the cloud plays an important role. For example, many organizations have a cloud-first mandate because they know that cloud computing is the future of data center infrastructure. In addition, the cloud is a common target for data warehouse modernization because cloud-based data platforms are the most modern ones available for warehouses today.
Finally, a cloud is an easily centralized and globally available platform, which makes it an ideal target for data consolidation, as well as popular use cases such as analytics, self-service data practices, and data sharing across organizational boundaries.
Users who modernize a data warehouse need to plan carefully for the complexity, time, business disruption, risks, and costs of migrating and/or consolidating data onto cloud-based platforms suitable for data warehousing, as follows.
Avoid a big bang project. That kind of plan attempts to modernize and migrate too much too fast. The large size and complexity of deliverables raise the probability of failure. By comparison, a project plan with multiple phases will be a less risky way to achieve your goals for modernization and cloud migration. A multiphase project plan segments work into multiple manageable pieces, each with a realistic technical goal that adds discernible business value.
The first deliverable should be easy but useful. For example, successful data migration or replatforming projects should focus the first phase on a data subset or use case that is both easy to construct and in high demand by the business. Prioritize early phases so they give everyone confidence by demonstrating technical prowess and business value. Save problematic phases for later.
Cloud migration is not just for data. You are also migrating (or simply redirecting the access of) business processes, groups of warehouse end users, reports, applications, analysts, developers, and data management solutions. Your plan should explain when and how each entity will be migrated or redirected to the cloud. Managers and users should be involved in planning to ensure their needs are addressed with minimal disruption to business operations.
Manage risk with contingency plans. Expect to fail, but know that segmenting work into phases has the added benefit of limiting the scope of failure. Be ready to recover from a failed phase by rolling back to the state of the prior phase. Don't be too eager to unplug the old platforms because you may need them for rollback. It is inevitable that old and new data warehouse platforms (both on premises and on clouds) will operate simultaneously for months or years, depending on the size and complexity of the data, user groups, and business processes you are migrating.
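The idea of limiting the scope of failure to a single phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Phase` class, the `run_phases` function, and the `migrate` and `rollback` callables are hypothetical names, not part of any warehouse product or of the Checklist Report, and a real project would put platform-specific work behind each callable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Phase:
    """One manageable piece of the migration (hypothetical structure)."""
    name: str
    migrate: Callable[[], None]   # moves one data subset or user group
    rollback: Callable[[], None]  # restores the state that existed before this phase


def run_phases(phases: List[Phase]) -> List[str]:
    """Run phases in order and return the names of those that completed.

    If a phase fails, only that phase is rolled back and the run stops,
    so earlier phases stay in place and later ones are never attempted.
    """
    completed: List[str] = []
    for phase in phases:
        try:
            phase.migrate()
        except Exception as exc:
            print(f"Phase {phase.name!r} failed ({exc}); rolling back.")
            phase.rollback()
            break
        completed.append(phase.name)
    return completed
```

Because each phase carries its own rollback step, a failure late in the plan does not undo the phases that already delivered value, which is the practical argument for keeping the old platform running until the last phase is stable.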
Beware lift-and-shift projects. Sometimes you can "lift and shift" data from one system to another with minimal work, but usually you cannot. Even when lift and shift works, developers need to tweak data models and interfaces for maximum performance on the new platform. A replatforming project can easily turn into a development project when data being migrated or consolidated requires considerable work.
In particular, organizations facing migrations of older applications and data to cloud platforms should assume that lift and shift will be inadequate because old and new platforms (especially when strewn across on-premises and cloud systems) will differ in terms of interfaces, tool or platform functionality, and performance characteristics. When the new platform offers little or no backward compatibility with the old one, development may be needed for platform-specific components, such as stored procedures, user-defined functions, and hand-coded routines.
Improve data, don't just move it. Problems with data quality, data modeling, and metadata should be remediated before or during migration. Otherwise, you're just carrying your old problems onto the new platform. In all data management work, when you move data you should also endeavor to improve data.
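As one small illustration of remediating quality problems during migration, the sketch below screens rows before they are loaded into the new platform. It is a minimal example under stated assumptions: the function name `check_rows`, the two rules it applies (required fields present, no duplicate keys), and the column names in the usage note are hypothetical, and a real migration would apply whatever rules the data actually needs.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def check_rows(
    rows: Iterable[Row], key: str, required: List[str]
) -> Tuple[List[Row], List[Tuple[Row, str]]]:
    """Split rows into those safe to load and those needing remediation.

    A row is rejected if the key or any required column is missing or
    empty, or if its key value was already seen on an earlier row.
    """
    # The key column is always treated as required; dict.fromkeys removes
    # duplicates while keeping order.
    columns = list(dict.fromkeys([key, *required]))
    clean: List[Row] = []
    rejected: List[Tuple[Row, str]] = []
    seen = set()
    for row in rows:
        missing = [c for c in columns if row.get(c) in (None, "")]
        if missing:
            rejected.append((row, f"missing: {', '.join(missing)}"))
        elif row[key] in seen:
            rejected.append((row, f"duplicate {key}"))
        else:
            seen.add(row[key])
            clean.append(row)
    return clean, rejected
```

Calling `check_rows(rows, "customer_id", ["email"])` on an extract would return the rows ready to load plus a list of rejected rows with a reason for each, so the rejects can be fixed at the source rather than carried onto the new platform.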
Assemble a diverse team for modernizing and replatforming a data warehouse. Obviously, data management professionals are required. Data warehouse modernization and replatforming usually need specialists in warehousing, integration, analytics, and reporting. When tweaks and new development are required, experts in data modeling, architecture, and data languages may be needed. Don't overlook the maintenance work required of database administrators (DBAs), systems analysts, and IT staff. Before migrating to a cloud-based data warehouse platform, consider hiring consultants or new employees who have cloud experience, not just data management experience. Finally, do not overlook the need for training employees on the new cloud platform.
Data migrations affect many types of people. Your plan should accommodate them all. A mature data warehouse will serve a long list of end users who consume reports, dashboards, metrics, analyses, and other products of data warehousing and business intelligence. These people report to a line-of-business manager and other middle managers. Affected parties (i.e., managers and sometimes end users, too) should be involved in planning a data warehouse modernization and migration to the cloud. First, their input should affect the whole project from the beginning so they get what they need to be successful with the new cloud data warehouse. Second, the new platform rollout should take into consideration the productivity and process needs of all affected parties.
Coordinate with external parties when appropriate. In some scenarios, such as those for supply chain, e-commerce, and business-to-business relationships, the plan for migration to the cloud should also stipulate dates and actions for partners, suppliers, clients, customers, and other external entities. Light technical work may be required of external parties, as when customers or suppliers have online access to reports or analytics supported by a cloud data warehouse platform.
To Learn More
For more information, read the 2019 TDWI Checklist Report: Cloud at Scale for the Modern Data Warehouse, online at www.tdwi.org/checklists. Portions of this article were drawn from that Checklist.
About the Author
Philip Russom is director of TDWI Research for data management and oversees many of TDWI’s research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 600 research reports, magazine articles, opinion columns, speeches, webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at @prussom on Twitter and on LinkedIn at linkedin.com/in/philiprussom.