Strategies for Data Platform Consolidation
Data platforms must play their real, architected roles in your environment. Platform consolidation can save you money and headaches.
- By William McKnight
- January 6, 2017
Many shops experience database spread -- the existence of far too many databases, often with redundant data, and often for very discrete needs. In these shops, databases are often built relatively quickly to expedite projects and applications. Once this is done repeatedly over the years and made the default go-to approach, the "data architecture" can become accidental. This is not just an application-first approach; it's an application-only approach, with the data being secondary.
Of course, this is not an approach anyone would openly advocate. It's born of expediency.
Data platform consolidation may not be as sexy as starting a big data or IoT project. However, because consolidation projects can both make and save the company money, they are seeing a strong resurgence.
Furthermore, it's a great excuse to take advantage of cloud database offerings.
Data warehouses have a lower total cost of ownership than data marts. That's understood, but what is data warehousing in this business/financial context? It's a shared platform. It's a "build once, use many times" strategy. It means multiple business projects can use the data without having to build separate robust data layers. Allowing concurrent use of data at the data warehouse layer, or creating a mart off the data warehouse, takes far less work, carries less risk, and costs less overall than building from uncultivated, original source data.
Putting labels aside, the shared data approach is worth pursuing.
This could be done in anticipation of multiple projects or it could be done as a cost-reducing (and opportunity-enhancing) consolidation project.
Focusing on the latter, you could consolidate at different levels. You could just add virtualization or a common semantic layer over the top of the existing databases. Sometimes this is enough, but often you need the performance and usage benefits of physical consolidation.
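To make the lightest-weight option concrete, here is a minimal sketch of a virtualization layer: one view that presents two separate marts as a single customer list without moving any data. It uses Python's built-in sqlite3 module purely for illustration. The database names (mart_sales, mart_finance), the customers table, and the all_customers view are hypothetical, and a real environment would more likely rely on a dedicated virtualization or federation product.

```python
import sqlite3


def build_virtual_layer():
    """Create two stand-in marts and one view that spans both of them."""
    conn = sqlite3.connect(":memory:")
    # Each attached in-memory database stands in for an existing mart (hypothetical names).
    conn.execute("ATTACH DATABASE ':memory:' AS mart_sales")
    conn.execute("ATTACH DATABASE ':memory:' AS mart_finance")
    conn.execute("CREATE TABLE mart_sales.customers (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE mart_finance.customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO mart_sales.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO mart_finance.customers VALUES (?, ?)",
        [(2, "Globex"), (3, "Initech")],
    )
    # A TEMP view may reference tables in any attached database.
    # UNION (rather than UNION ALL) drops rows that are redundant across marts.
    conn.execute(
        """
        CREATE TEMP VIEW all_customers AS
        SELECT id, name FROM mart_sales.customers
        UNION
        SELECT id, name FROM mart_finance.customers
        """
    )
    return conn


def list_customers(conn):
    """Query the shared view as if it were one consolidated table."""
    return conn.execute(
        "SELECT id, name FROM all_customers ORDER BY id"
    ).fetchall()
```

Calling list_customers(build_virtual_layer()) returns one deduplicated list drawn from both marts. The data still lives in two places, though, which is why a layer like this does not deliver the performance benefits of physical consolidation.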
I've worked on many consolidation projects as well as projects that require consolidated data. The key is getting the data platforms that are going to be retained to play their real, architected roles in the environment. These architected roles can be:
- The data warehouse
- A data mart fed from the data warehouse
- An operational data store (ODS)
- A staging area
If there are many "data warehouses," you could pick one to be the one data warehouse and consolidate the information from the others into that warehouse. This is best when one of the warehouses exhibits many of the characteristics of a good data warehouse and the others are mostly redundant.
In one environment, we picked one of the many "data warehouses" to be the one data warehouse and repurposed another data warehouse to be an ODS. In this unique case, we already had an almost-great data warehouse, but it was diluted by the existence of a competing "warehouse" which was feeding the first warehouse. We just gave the structures the right labels and use cases.
You could also add a "real" data warehouse to the environment where previously only data marts existed, and make the marts into ODSs and/or staging areas. This is necessary when none of the existing structures looks like a data warehouse. The marts don't enhance the source data (or model), but perhaps they do a good job of getting the data out of the sources and we want to leverage that capability.
You might instead need to add a "real" data warehouse to the environment where previously only data marts existed, and eliminate the marts. This approach is good when many marts have proliferated without a data warehouse in place.
Organizations have been too quick to label databases "data warehouses"! Similar to the prior example, perhaps the database is really something else. Perhaps it's really a staging area. Staging areas are also necessary in good data warehouse architecture, so in one case we picked one of the many "data warehouses" to be the one data warehouse and repurposed another to be the staging area.
In cases of repurposing, which happens frequently in data platform consolidation, pick platforms that are already close to performing those functions today (data warehouse, ODS, staging) so the transition is smooth and often more mental than anything else. Rearrange the pieces logically and apply the correct labels and use cases going forward.
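One way to make "close to actually performing those functions" less subjective is to inventory each platform's observed characteristics and compare them with the characteristics expected of each role. The sketch below is only an illustration of that idea. The trait names and the simple overlap score are assumptions introduced here, not a standard checklist, and a real assessment would weigh far more than a handful of flags.

```python
from dataclasses import dataclass

# Hypothetical trait checklist per architected role (illustrative assumptions only).
ROLE_TRAITS = {
    "data warehouse": frozenset({
        "shared_across_projects",
        "integrates_sources",
        "enhances_source_model",
        "keeps_history",
    }),
    "operational data store": frozenset({
        "integrates_sources",
        "near_real_time",
        "current_data_only",
    }),
    "staging area": frozenset({
        "extracts_from_sources",
        "mirrors_source_structures",
        "transient_data",
    }),
}


@dataclass(frozen=True)
class Platform:
    name: str
    traits: frozenset  # characteristics observed in the platform today


def closest_role(platform):
    """Return (role, score) for the role whose traits the platform covers best."""
    def score(role):
        wanted = ROLE_TRAITS[role]
        # Fraction of the role's expected traits that the platform already has.
        return len(platform.traits & wanted) / len(wanted)

    best = max(ROLE_TRAITS, key=score)
    return best, score(best)


# Usage: a so-called "warehouse" that mostly just lands source data.
dw2 = Platform("dw2", frozenset({"extracts_from_sources", "mirrors_source_structures"}))
role, fit = closest_role(dw2)
print(f"{dw2.name} is closest to: {role} ({fit:.0%} of expected traits)")
```

A platform labeled a "data warehouse" that scores highest as a staging area is a candidate for relabeling rather than retirement, which is the kind of mostly mental transition described above.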
Data platform consolidation can improve both your total cost of ownership and your capabilities.
McKnight Consulting Group is led by William McKnight. He serves as strategist, lead enterprise information architect, and program manager for sites worldwide utilizing the disciplines of data warehousing, master data management, business intelligence, and big data. Many of his clients have gone public with their success stories. McKnight has published hundreds of articles and white papers and given hundreds of international keynotes and public seminars. His teams’ implementations from both IT and consultant positions have won awards for best practices. William is a former IT VP of a Fortune 50 company and a former engineer of DB2 at IBM, and holds an MBA. He is author of the book Information Management: Strategies for Gaining a Competitive Advantage with Data.